Preface


This book provides an introduction to Lie groups, Lie algebras, and repre- 
sentation theory, aimed at graduate students in mathematics and physics. 
Although there are already several excellent books that cover many of the 
same topics, this book has two distinctive features that I hope will make it a 
useful addition to the literature. First, it treats Lie groups (not just Lie alge- 
bras) in a way that minimizes the amount of manifold theory needed. Thus, 
I neither assume a prior course on differentiable manifolds nor provide a con- 
densed such course in the beginning chapters. Second, this book provides a 
gentle introduction to the machinery of semisimple groups and Lie algebras by 
treating the representation theory of SU(2) and SU(3) in detail before going 
to the general case. This allows the reader to see roots, weights, and the Weyl
group “in action” in simple cases before confronting the general theory. 

The standard books on Lie theory begin immediately with the general case: 
a smooth manifold that is also a group. The Lie algebra is then defined as the 
space of left-invariant vector fields and the exponential mapping is defined in 
terms of the flow along such vector fields. This approach is undoubtedly the 
right one in the long run, but it is rather abstract for a reader encountering 
such things for the first time. Furthermore, with this approach, one must either 
assume the reader is familiar with the theory of differentiable manifolds (which 
rules out a substantial part of one’s audience) or one must spend considerable 
time at the beginning of the book explaining this theory (in which case, it 
takes a long time to get to Lie theory proper). 

My way out of this dilemma is to consider only matrix groups (i.e., closed 
subgroups of GL(n;C)). (Others before me have taken such an approach, as 
discussed later.) Every such group is a Lie group, and although not every Lie 
group is of this form, most of the interesting examples are. The exponential 
of a matrix is then defined by the usual power series, and the Lie algebra g of 
a closed subgroup G of GL(n; C) is defined to be the set of matrices X such 
that exp(tX) lies in G for all real numbers t. One can show that g is, indeed, 
a Lie algebra (i.e., a vector space and closed under commutators). The usual 
elementary results can all be proved from this point of view: the image of the 
exponential mapping contains a neighborhood of the identity; in a connected
group, every element is a product of exponentials; every continuous group 
homomorphism induces a Lie algebra homomorphism. (These results show 
that every matrix group is a smooth embedded submanifold of GL(n; C), and 
hence a Lie group.) 

I also address two deeper results: that in the simply-connected case, every 
Lie algebra homomorphism induces a group homomorphism and that there 
is a one-to-one correspondence between subalgebras h of g and connected Lie 
subgroups H of G. The usual approach to these theorems makes use of the 
Frobenius theorem. Although this is a fundamental result in analysis, it is 
not easily stated (let alone proved) and it is not especially Lie-theoretic. My 
approach is to use, instead, the Baker-Campbell-Hausdorff theorem. This
theorem is more elementary than the Frobenius theorem and arguably gives 
more intuition as to why the above-mentioned results are true. I begin with the 
technically simpler case of the Heisenberg group (where the Baker-Campbell- 
Hausdorff series terminates after the first commutator term) and then proceed 
to the general case. 

Appendix C gives two examples of Lie groups that are not matrix Lie 
groups. Both examples are constructed from matrix Lie groups: One is the 
universal cover of SL(n;R) and the other is the quotient of the Heisenberg 
group by a discrete central subgroup. These examples show the limitations of 
working with matrix Lie groups, namely that important operations such as
taking quotients and covers do not preserve the class of matrix Lie groups.
In the long run, then, the theory of matrix Lie groups is not an acceptable 
substitute for general Lie group theory. Nevertheless, I feel that the matrix 
approach is suitable for a first course in the subject not only because most of 
the interesting examples of Lie groups are matrix groups but also because all 
of the theorems I will discuss for the matrix case continue to hold for general 
Lie groups. In fact, most of the proofs are the same in the general case, except 
that in the general case, one needs to spend a lot more time setting up the 
basic notions before one can begin. 

In addressing the theory of semisimple groups and Lie algebras, I use repre- 
sentation theory as a motivation for the structure theory. In particular, I work 
out in detail the representation theory of SU(2) (or, equivalently, sl(2;C)) and 
SU(3) (or, equivalently, sl(3;C)) before turning to the general semisimple case.
The sl(3;C) case (more so than just the sl(2;C) case) illustrates in a concrete 
way the significance of the Cartan subalgebra, the roots, the weights, and the 
Weyl group. In the general semisimple case, I keep the representation theory 
at the fore, introducing at first only as much structure as needed to state the 
theorem of the highest weight. I then turn to a more detailed look at root 
systems, including two- and three-dimensional examples, Dynkin diagrams, 
and a discussion (without proof) of the classification. This portion of the text 
includes numerous images of the relevant structures (root systems, lattices of 
dominant integral elements, and weight diagrams) in ranks two and three. 

I take full advantage, in treating the semisimple theory, of the correspondence
established earlier between the representations of a simply-connected
group and the representations of its Lie algebra. So, although I treat things 
from the point of view of complex semisimple Lie algebras, I take advantage of 
the characterization of such algebras as ones isomorphic to the complexifica- 
tion of the Lie algebra of a compact simply-connected Lie group K. (Although, 
for the purposes of this book, we could take this as the definition of a com- 
plex semisimple Lie algebra, it is equivalent to the usual algebraic definition.) 
Having the compact group at our disposal simplifies several issues. First and 
foremost, it implies the complete reducibility of the representations. Second, 
it gives a simple construction of Cartan subalgebras, as the complexification 
of any maximal abelian subalgebra of the Lie algebra of K. Third, it gives a 
more transparent construction of the Weyl group, as W = N(T)/T, where T 
is a maximal torus in K. This description makes it evident, for example, why 
the weights of any representation are invariant under the action of W. Thus, 
my treatment is a mixture of the Lie algebra approach of Humphreys (1972) 
and the compact group approach of Bröcker and tom Dieck (1985) or Simon
(1996). 

This book is intended to supplement rather than replace the standard texts 
on Lie theory. I recommend especially four texts for further reading: the book 
of Lee (2003) for manifold theory and the relationship between Lie groups 
and Lie algebras, the book of Humphreys (1972) for the Lie algebra approach 
to representation theory, the book of Bröcker and tom Dieck (1985) for the
compact-group approach to representation theory, and the book of Fulton and 
Harris (1991) for numerous examples of representations of the classical groups. 
There are, of course, many other books worth consulting; some of these are 
listed in the Bibliography. 

I hope that by keeping the mathematical prerequisites to a minimum, I 
have made this book accessible to students in physics as well as mathematics. 
Although much of the material in the book is widely used in physics, physics 
students are often expected to pick up the material by osmosis. I hope that 
they can benefit from a treatment that is elementary but systematic and 
mathematically precise. In Appendix A, I provide a quick introduction to the 
theory of groups (not necessarily Lie groups), which is not as standard a part 
of the physics curriculum as it is of the mathematics curriculum. 

The main prerequisite for this book is a solid grounding in linear algebra, 
especially eigenvectors and the notion of diagonalizability. A quick review of 
the relevant material is provided in Appendix B. In addition to linear algebra, 
only elementary analysis is needed: limits, derivatives, and an occasional use 
of compactness and the inverse function theorem. 

There are, to my knowledge, five other treatments of Lie theory from the 
matrix group point of view. These are (in order of publication) the book Linear 
Lie Groups, by Hans Freudenthal and H. de Vries, the book Matrix Groups, 
by Morton L. Curtis, the article “Very Basic Lie Theory,” by Roger Howe, 
and the recent books Matrix Groups: An Introduction to Lie Group Theory,
by Andrew Baker, and Lie Groups: An Introduction Through Linear Groups,
by Wulf Rossmann. (All of these are listed in the Bibliography.) The book of 
Freudenthal and de Vries covers a lot of ground, but its unorthodox style and 
notation make it rather inaccessible. The works of Curtis, Howe, and Baker 
overlap considerably, in style and content, with the first two chapters of this 
book, but do not attempt to cover as much ground. For example, none of 
them treats representation theory or the Baker-Campbell-Hausdorff formula.
The book of Rossmann has many similarities with this book, including the
use of the Baker-Campbell-Hausdorff formula. However, Rossmann’s book is
a bit different at the technical level, in that he considers arbitrary subgroups 
of GL(n; C), with no restriction on the topology. 

Although the organization of this book is, I believe, substantially different 
from that of other books on the subject, I make no claim to originality in any 
of the proofs. I myself learned most of the material here from books listed 
in the Bibliography, especially Humphreys (1972), Bröcker and tom Dieck
(1985), and Miller (1972). 

I am grateful to many who made corrections, large and small, to the text 
before publication, including Ed Bueler, Wesley Calvert, Tom Goebeler, Ruth 
Gornet, Keith Hubbard, Wicharn Lewkeeratiyutkul, Jeffrey Mitchell, Ambar 
Sengupta, and Erdinch Tatar. I am grateful as well to those who have pointed 
out errors in the first printing (which have been corrected in this, the second 
printing), including Moshe Adrian, Kamthorn Chailuek, Paul Gibson, Keith 
Hubbard, Dennis Muhonen, Jason Quinn, Rebecca Weber, and Reed Wickner. 

I also thank Paul Hildebrant for assisting with the construction of mod- 
els of rank-three root systems using Zome, Judy Hygema for taking digital 
photographs of the models, and Charles Albrecht for rendering the color
images. Finally, I especially thank Scott Vorthmann for making the vZome
software available to me and for assisting me in its use.

I welcome comments by e-mail at bhall@nd.edu. Please visit my web site 
at http://www.nd.edu/~bhall/ for more information, including an up-to-date 
list of corrections and many more color pictures than could be included in the 
book. 


Notre Dame, Indiana Brian C. Hall 
May 2004 


Part I


General Theory 


1 


Matrix Lie Groups 


1.1 Definition of a Matrix Lie Group 


We begin with a very important class of groups, the general linear groups. The 
groups we will study in this book will all be subgroups (of a certain sort) of 
one of the general linear groups. This chapter makes use of various standard 
results from linear algebra that are summarized in Appendix B. This chapter 
also assumes basic facts and definitions from the theory of abstract groups; 
the necessary information is provided in Appendix A. 


Definition 1.1. The general linear group over the real numbers, denoted 
GL(n; R), is the group of all n x n invertible matrices with real entries. The 
general linear group over the complex numbers, denoted GL(n; C), is the group 
of all n x n invertible matrices with complex entries. 


The general linear groups are indeed groups under the operation of matrix 
multiplication: The product of two invertible matrices is invertible, the iden- 
tity matrix is an identity for the group, an invertible matrix has (by definition) 
an inverse, and matrix multiplication is associative. 


Definition 1.2. Let $M_n(\mathbb{C})$ denote the space of all n × n matrices with complex
entries.


Definition 1.3. Let $A_m$ be a sequence of complex matrices in $M_n(\mathbb{C})$. We
say that $A_m$ converges to a matrix A if each entry of $A_m$ converges (as
$m \to \infty$) to the corresponding entry of A (i.e., if $(A_m)_{kl}$ converges to $A_{kl}$ for
all $1 \le k, l \le n$).


Definition 1.4. A matrix Lie group is any subgroup G of GL(n; C) with the
following property: If $A_m$ is any sequence of matrices in G, and $A_m$ converges
to some matrix A, then either $A \in G$, or A is not invertible.


The condition on G amounts to saying that G is a closed subset of GL(n; C). 
(This does not necessarily mean that G is closed in $M_n(\mathbb{C})$.) Thus, Definition
1.4 is equivalent to saying that a matrix Lie group is a closed subgroup of
GL(n;C). 

The condition that G be a closed subgroup, as opposed to merely a sub- 
group, should be regarded as a technicality, in that most of the interesting 
subgroups of GL(n;C) have this property. (Most of the matrix Lie groups G 
we will consider have the stronger property that if $A_m$ is any sequence of
matrices in G, and $A_m$ converges to some matrix A, then $A \in G$ (i.e., that G
is closed in $M_n(\mathbb{C})$).)


1.1.1 Counterexamples 


An example of a subgroup of GL(n; C) which is not closed (and hence is not a 
matrix Lie group) is the set of all n x n invertible matrices all of whose entries 
are real and rational. This is in fact a subgroup of GL(n;C), but not a closed 
subgroup. That is, one can (easily) have a sequence of invertible matrices 
with rational entries converging to an invertible matrix with some irrational 
entries. (In fact, every real invertible matrix is the limit of some sequence of 
invertible matrices with rational entries.) 

Another example of a group of matrices which is not a matrix Lie group 
is the following subgroup of GL(2;C). Let a be an irrational real number and
let

$$G = \left\{ \begin{pmatrix} e^{it} & 0 \\ 0 & e^{ita} \end{pmatrix} \;\middle|\; t \in \mathbb{R} \right\}.$$


Clearly, G is a subgroup of GL(2; C). Because a is irrational, the matrix $-I$ is
not in G, since to make $e^{it}$ equal to $-1$, we must take t to be an odd integer
multiple of $\pi$, in which case ta cannot be an odd integer multiple of $\pi$. On the
other hand (Exercise 1), by taking $t = (2n + 1)\pi$ for a suitably chosen integer
n, we can make ta arbitrarily close to an odd integer multiple of $\pi$. Hence,
we can find a sequence of matrices in G which converges to $-I$, and so G is
not a matrix Lie group. See Exercise 1 and Exercise 18 for more information.
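
The approximation argument above can be explored numerically. The following is only a sketch under stated assumptions: the choice $a = \sqrt{2}$, the search range for n, and the helper names `element` and `distance_to_minus_identity` are illustrative and do not come from the text.

```python
import cmath
import math

a = math.sqrt(2)  # an irrational number, chosen here purely for illustration


def element(t):
    """Diagonal entries (e^{it}, e^{ita}) of the element of G with parameter t."""
    return (cmath.exp(1j * t), cmath.exp(1j * t * a))


def distance_to_minus_identity(t):
    """Largest entrywise distance between the element with parameter t and -I."""
    d1, d2 = element(t)
    return max(abs(d1 + 1), abs(d2 + 1))


# Restrict to t = (2n + 1) * pi, so that the first diagonal entry is -1,
# and search for the n that brings the second entry closest to -1.
best_n = min(range(2000),
             key=lambda n: distance_to_minus_identity((2 * n + 1) * math.pi))
best = distance_to_minus_identity((2 * best_n + 1) * math.pi)
print(best_n, best)
```

Enlarging the search range should produce elements of G still closer to $-I$, even though $-I$ itself is never attained.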


1.2 Examples of Matrix Lie Groups 


Mastering the subject of Lie groups involves not only learning the general the- 
ory but also familiarizing oneself with examples. In this section, we introduce 
some of the most important examples of (matrix) Lie groups. 


1.2.1 The general linear groups GL(n;R) and GL(n; C) 


The general linear groups (over R or C) are themselves matrix Lie groups. 
Of course, GL(n; C) is a subgroup of itself. Furthermore, if $A_m$ is a sequence
of matrices in GL(n;C) and $A_m$ converges to A, then by the definition of
GL(n;C), either A is in GL(n;C), or A is not invertible.
Moreover, GL(n;R) is a subgroup of GL(n;C), and if $A_m \in \mathrm{GL}(n;\mathbb{R})$ and
$A_m$ converges to A, then the entries of A are real. Thus, either A is not
invertible or $A \in \mathrm{GL}(n;\mathbb{R})$.


1.2.2 The special linear groups SL(n; R) and SL(n; C) 


The special linear group (over R or C) is the group of n x n invertible 
matrices (with real or complex entries) having determinant one. Both of these 
are subgroups of GL(n; C). Furthermore, if $A_m$ is a sequence of matrices with
determinant one and $A_m$ converges to A, then A also has determinant one,
because the determinant is a continuous function. Thus, SL(n; R) and SL(n; C)
are matrix Lie groups. 


1.2.3 The orthogonal and special orthogonal groups, O(n) and 
SO(n) 


An n × n real matrix A is said to be orthogonal if the column vectors that
make up A are orthonormal, that is, if


$$\sum_{l=1}^{n} A_{lj} A_{lk} = \delta_{jk}, \quad 1 \le j, k \le n.$$


(Here $\delta_{jk}$ is the Kronecker delta, equal to 1 if j = k and equal to zero if
$j \ne k$.) Equivalently, A is orthogonal if it preserves the inner product, namely if
$\langle x, y \rangle = \langle Ax, Ay \rangle$ for all vectors x, y in $\mathbb{R}^n$. (Angled brackets denote the usual
inner product on $\mathbb{R}^n$, $\langle x, y \rangle = \sum_k x_k y_k$.) Still another equivalent definition
is that A is orthogonal if $A^{tr} A = I$, i.e., if $A^{tr} = A^{-1}$. (Here, $A^{tr}$ is the
transpose of A, $(A^{tr})_{kl} = A_{lk}$.) See Exercise 2.

Since $\det A^{tr} = \det A$, we see that if A is orthogonal, then
$\det(A^{tr} A) = (\det A)^2 = \det I = 1$. Hence, $\det A = \pm 1$, for all orthogonal matrices A.

This formula tells us in particular that every orthogonal matrix must be
invertible. Moreover, if A is an orthogonal matrix, then


$$\langle A^{-1}x, A^{-1}y \rangle = \langle A(A^{-1}x), A(A^{-1}y) \rangle = \langle x, y \rangle.$$


Thus, the inverse of an orthogonal matrix is orthogonal. Furthermore, the 
product of two orthogonal matrices is orthogonal, since if A and B both 
preserve inner products, then so does AB. Thus, the set of orthogonal matrices 
forms a group. 

The set of all n x n real orthogonal matrices is the orthogonal group 
O(n), and it is a subgroup of GL(n;C). The limit of a sequence of orthogonal 
matrices is orthogonal, because the relation $A^{tr}A = I$ is preserved under
taking limits. Thus, O(n) is a matrix Lie group. 

The set of n x n orthogonal matrices with determinant one is the special 
orthogonal group SO(n). Clearly, this is a subgroup of O(n), and hence of
GL(n; C). Moreover, both orthogonality and the property of having determinant
one are preserved under limits, and so SO(n) is a matrix Lie group. Since
elements of O(n) already have determinant $\pm 1$, SO(n) is “half” of O(n).
Geometrically, elements of O(n) are either rotations or combinations of 
rotations and reflections. The elements of SO(n) are just the rotations. 
See also Exercise 6. 
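
As a concrete check of these statements, the sketch below tests the condition $A^{tr}A = I$ for a rotation and a reflection of the plane and compares their determinants. It is an illustration only: the angle 0.7, the tolerance, and the helper names are assumptions made for the example.

```python
import math


def transpose(A):
    """Transpose of a matrix given as a list of rows."""
    return [list(row) for row in zip(*A)]


def matmul(A, B):
    """Product of two matrices given as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]


def is_orthogonal(A, tol=1e-12):
    """Check A^tr A = I entrywise, up to a floating-point tolerance."""
    n = len(A)
    P = matmul(transpose(A), A)
    return all(abs(P[i][j] - (1.0 if i == j else 0.0)) < tol
               for i in range(n) for j in range(n))


def det2(A):
    """Determinant of a 2 x 2 matrix."""
    return A[0][0] * A[1][1] - A[0][1] * A[1][0]


theta = 0.7  # arbitrary angle, chosen for illustration
rotation = [[math.cos(theta), -math.sin(theta)],
            [math.sin(theta), math.cos(theta)]]
reflection = [[1.0, 0.0], [0.0, -1.0]]

print(is_orthogonal(rotation), det2(rotation))      # an element of SO(2)
print(is_orthogonal(reflection), det2(reflection))  # in O(2) but not in SO(2)
```

The product of the rotation with the reflection is again orthogonal with determinant $-1$, in line with the description of O(n) as rotations together with combinations of rotations and reflections.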


1.2.4 The unitary and special unitary groups, U(n) and SU(n) 


An n × n complex matrix A is said to be unitary if the column vectors of A
are orthonormal, that is, if


$$\sum_{l=1}^{n} \overline{A_{lj}} A_{lk} = \delta_{jk}.$$


Equivalently, A is unitary if it preserves the inner product, namely if
$\langle x, y \rangle = \langle Ax, Ay \rangle$ for all vectors x, y in $\mathbb{C}^n$. (Angled brackets here denote the inner
product on $\mathbb{C}^n$, $\langle x, y \rangle = \sum_k \overline{x_k} y_k$. We will adopt the convention of putting
the complex conjugate on the left.) Still another equivalent definition is that
A is unitary if $A^*A = I$, i.e., if $A^* = A^{-1}$. (Here, $A^*$ is the adjoint of A,
$(A^*)_{jk} = \overline{A_{kj}}$.) See Exercise 3.

Since $\det A^* = \overline{\det A}$, we see that if A is unitary, then
$\det(A^*A) = |\det A|^2 = \det I = 1$. Hence, $|\det A| = 1$, for all unitary matrices A.

This, in particular, shows that every unitary matrix is invertible. The same 
argument as for the orthogonal group shows that the set of unitary matrices 
forms a group. 

The set of all n x n unitary matrices is the unitary group U(n), and it 
is a subgroup of GL(n; C). The limit of unitary matrices is unitary, so U(n) is 
a matrix Lie group. The set of unitary matrices with determinant one is the 
special unitary group SU(n). It is easy to check that SU(n) is a matrix Lie 
group. Note that a unitary matrix can have determinant $e^{i\theta}$ for any $\theta$, and so
SU(n) is a smaller subset of U(n) than SO(n) is of O(n). (Specifically, SO(n) 
has the same dimension as O(n), whereas SU(n) has dimension one less than 
that of U(n).) 

See also Exercise 8. 
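
The remark about determinants can be illustrated on a small example. This is a sketch only: the matrix U, the phase 0.3, and the helper names are assumptions chosen for the illustration. Multiplying an element of SU(2) by a phase $e^{i\theta}$ keeps it unitary but moves its determinant around the unit circle.

```python
import cmath
import math


def adjoint(A):
    """Conjugate transpose A* of a square complex matrix given as a list of rows."""
    n = len(A)
    return [[A[j][i].conjugate() for j in range(n)] for i in range(n)]


def matmul(A, B):
    """Product of two matrices given as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]


def is_unitary(A, tol=1e-12):
    """Check A* A = I entrywise, up to a floating-point tolerance."""
    n = len(A)
    P = matmul(adjoint(A), A)
    return all(abs(P[i][j] - (1 if i == j else 0)) < tol
               for i in range(n) for j in range(n))


def det2(A):
    """Determinant of a 2 x 2 matrix."""
    return A[0][0] * A[1][1] - A[0][1] * A[1][0]


s = 1 / math.sqrt(2)
U = [[complex(s, 0), complex(0, s)],
     [complex(0, s), complex(s, 0)]]       # unitary with determinant 1
theta = 0.3                                 # arbitrary phase, for illustration
V = [[cmath.exp(1j * theta) * entry for entry in row] for row in U]

print(is_unitary(U), det2(U))   # U lies in SU(2)
print(is_unitary(V), det2(V))   # V lies in U(2); its determinant is e^{2 i theta}
```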


1.2.5 The complex orthogonal groups, O(n; C) and SO(n; C) 


Consider the bilinear form $(\cdot,\cdot)$ on $\mathbb{C}^n$ defined by $(x, y) = \sum_k x_k y_k$. This form
is not an inner product (Section B.6) because, for example, it is symmetric
rather than conjugate-symmetric. The set of all n × n complex matrices A
which preserve this form (i.e., such that $(Ax, Ay) = (x, y)$ for all $x, y \in \mathbb{C}^n$) is
the complex orthogonal group O(n;C), and it is a subgroup of GL(n; C).
Repeating the arguments for the case of SO(n) and O(n) (but now permitting
complex entries), we find that an n × n complex matrix A is in O(n; C) if and
only if $A^{tr} A = I$, that O(n;C) is a matrix Lie group, and that $\det A = \pm 1$
for all A in O(n; C). Note that O(n; C) is not the same as the unitary group
U(n). The group SO(n; C) is defined to be the set of all A in O(n;C) with
$\det A = 1$ and it is also a matrix Lie group.


1.2.6 The generalized orthogonal and Lorentz groups 


Let n and k be positive integers, and consider $\mathbb{R}^{n+k}$. Define a symmetric
bilinear form $[\cdot,\cdot]_{n,k}$ on $\mathbb{R}^{n+k}$ by the formula


$$[x, y]_{n,k} = x_1 y_1 + \cdots + x_n y_n - x_{n+1} y_{n+1} - \cdots - x_{n+k} y_{n+k}. \tag{1.1}$$


The set of (n + k) × (n + k) real matrices A which preserve this form (i.e.,
such that $[Ax, Ay]_{n,k} = [x, y]_{n,k}$ for all $x, y \in \mathbb{R}^{n+k}$) is the generalized
orthogonal group O(n; k). It is a subgroup of GL(n + k; R) and a matrix Lie
group (Exercise 4).

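If A is an (n + k) × (n + k) real matrix, let $A^{(i)}$ denote the ith column
vector of A, that is,

$$A^{(i)} = \begin{pmatrix} A_{1,i} \\ \vdots \\ A_{n+k,i} \end{pmatrix}.$$


Then, A is in O(n; k) if and only if the following conditions are satisfied:

$$\begin{array}{ll}
[A^{(l)}, A^{(j)}]_{n,k} = 0, & l \ne j, \\
[A^{(l)}, A^{(l)}]_{n,k} = 1, & 1 \le l \le n, \\
[A^{(l)}, A^{(l)}]_{n,k} = -1, & n + 1 \le l \le n + k.
\end{array} \tag{1.2}$$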


Let g denote the (n +k) x (n + k) diagonal matrix with ones in the first 
n diagonal entries and minus ones in the last k diagonal entries. Then, A is 
in O(n; k) if and only if $A^{tr} g A = g$ (Exercise 4). Taking the determinant of
this equation gives $(\det A)^2 \det g = \det g$, or $(\det A)^2 = 1$. Thus, for any A in
O(n;k), $\det A = \pm 1$.

Of particular interest in physics is the Lorentz group O(3; 1). See also 
Exercise 7. 
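
To make the condition $A^{tr} g A = g$ concrete for the Lorentz group O(3; 1), the sketch below checks it for a hyperbolic rotation ("boost") mixing the first and last coordinates. The family `boost(s)`, the parameter values, and the tolerance are assumptions introduced for illustration and are not taken from the text.

```python
import math


def transpose(A):
    """Transpose of a matrix given as a list of rows."""
    return [list(row) for row in zip(*A)]


def matmul(A, B):
    """Product of two matrices given as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]


# The matrix g for O(3; 1): ones in the first three diagonal entries, -1 in the last.
g = [[1, 0, 0, 0],
     [0, 1, 0, 0],
     [0, 0, 1, 0],
     [0, 0, 0, -1]]


def preserves_form(A, tol=1e-9):
    """Check the matrix condition A^tr g A = g entrywise."""
    P = matmul(matmul(transpose(A), g), A)
    return all(abs(P[i][j] - g[i][j]) < tol
               for i in range(4) for j in range(4))


def boost(s):
    """Hyperbolic rotation mixing coordinates 1 and 4 (hypothetical example family)."""
    c, sh = math.cosh(s), math.sinh(s)
    return [[c, 0, 0, sh],
            [0, 1, 0, 0],
            [0, 0, 1, 0],
            [sh, 0, 0, c]]


print(preserves_form(boost(1.5)))
print(preserves_form(matmul(boost(0.5), boost(-2.0))))
```

The check succeeds because $\cosh^2 s - \sinh^2 s = 1$, which is exactly condition (1.2) for the first and last columns.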


1.2.7 The symplectic groups Sp(n;R), Sp(n; C), and Sp(n) 


The special and general linear groups, the orthogonal and unitary groups, and 
the symplectic groups (which will be defined momentarily) make up the clas- 
sical groups. Of the classical groups, the symplectic groups have the most 
confusing definition, partly because there are three sets of them (Sp(n; R), 
Sp(n; C), and Sp(n)) and partly because they involve skew-symmetric bilin- 
ear forms rather than the more familiar symmetric bilinear forms. To further 
confuse matters, the notation for referring to these groups is not consistent
from author to author.

Consider the skew-symmetric bilinear form B on $\mathbb{R}^{2n}$ defined as follows:


$$B[x, y] = \sum_{k=1}^{n} (x_k y_{n+k} - x_{n+k} y_k). \tag{1.3}$$


The set of all 2n × 2n matrices A which preserve B (i.e., such that
$B[Ax, Ay] = B[x, y]$ for all $x, y \in \mathbb{R}^{2n}$) is the real symplectic group Sp(n; R), and it is
a subgroup of GL(2n;R). It is not difficult to check that this is a matrix
Lie group (Exercise 5). This group arises naturally in the study of classical
mechanics. If J is the 2n × 2n matrix


$$J = \begin{pmatrix} 0 & I \\ -I & 0 \end{pmatrix},$$


then $B[x, y] = \langle x, Jy \rangle$, and it is possible to check that a 2n × 2n real matrix A is
in Sp(n; R) if and only if $A^{tr} J A = J$. (See Exercise 5.) Taking the determinant
of this identity gives $(\det A)^2 \det J = \det J$, or $(\det A)^2 = 1$. This shows that
$\det A = \pm 1$, for all $A \in \mathrm{Sp}(n;\mathbb{R})$. In fact, $\det A = 1$ for all $A \in \mathrm{Sp}(n;\mathbb{R})$,
although this is not obvious.

One can define a bilinear form on $\mathbb{C}^{2n}$ by the same formula (1.3). (This
form involves no complex conjugates.) The set of 2n × 2n complex matrices
which preserve this form is the complex symplectic group Sp(n;C). A
2n × 2n complex matrix A is in Sp(n;C) if and only if $A^{tr} J A = J$. (Note:
This condition involves $A^{tr}$, not $A^*$.) This relation shows that $\det A = \pm 1$,
for all $A \in \mathrm{Sp}(n;\mathbb{C})$. In fact, $\det A = 1$, for all $A \in \mathrm{Sp}(n;\mathbb{C})$.

Finally, we have the compact symplectic group Sp(n) defined as 


$$\mathrm{Sp}(n) = \mathrm{Sp}(n;\mathbb{C}) \cap \mathrm{U}(2n).$$


See also Exercise 9. For more information and a proof that $\det A = 1$ for all
$A \in \mathrm{Sp}(n;\mathbb{C})$, see Section 9.4 of Miller (1972). What we call Sp(n; C), Miller
calls Sp(n), and what we call Sp(n), Miller calls USp(n).
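
In the smallest case n = 1, the condition $A^{tr} J A = J$ can be checked by hand or with a few lines of code. The sketch below is illustrative only: the sample matrix A, the test vectors, and the helper names are assumptions, not taken from the text. It also confirms directly that the form (1.3) is preserved.

```python
def transpose(A):
    """Transpose of a matrix given as a list of rows."""
    return [list(row) for row in zip(*A)]


def matmul(A, B):
    """Product of two matrices given as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]


def apply(A, x):
    """Matrix-vector product A x."""
    return [sum(A[i][k] * x[k] for k in range(len(x))) for i in range(len(A))]


J = [[0, 1], [-1, 0]]  # the matrix J in the case n = 1


def B(x, y):
    """The skew-symmetric form (1.3) for n = 1: x_1 y_2 - x_2 y_1."""
    return x[0] * y[1] - x[1] * y[0]


def is_symplectic(A):
    """Check A^tr J A = J exactly (integer entries are used here)."""
    return matmul(matmul(transpose(A), J), A) == J


A = [[2, 3], [1, 2]]  # sample integer matrix with determinant 1
x, y = [1, 4], [-2, 5]

print(is_symplectic(A))
print(B(apply(A, x), apply(A, y)) == B(x, y))
```

For n = 1 the condition reduces to $\det A = 1$, so a diagonal matrix with entries 1 and $-1$ fails the test even though its determinant has absolute value one.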


1.2.8 The Heisenberg group H 


The set of all 3 x 3 real matrices A of the form 


$$A = \begin{pmatrix} 1 & a & b \\ 0 & 1 & c \\ 0 & 0 & 1 \end{pmatrix}, \tag{1.4}$$


where a, b, and c are arbitrary real numbers, is the Heisenberg group. It is 
easy to check that the product of two matrices of the form (1.4) is again of 
that form, and, clearly, the identity matrix is of the form (1.4). Furthermore, 
direct computation shows that if A is as in (1.4), then 

$$A^{-1} = \begin{pmatrix} 1 & -a & ac - b \\ 0 & 1 & -c \\ 0 & 0 & 1 \end{pmatrix}.$$


Thus, H is a subgroup of GL(3; R). Clearly, the limit of matrices of the form 
(1.4) is again of that form, and so H is a matrix Lie group. 

The reason for the name “Heisenberg group” is that the Lie algebra of 
H gives a realization of the Heisenberg commutation relations of quantum 
mechanics. (See especially Chapter 4, Exercise 8.) 

See also Exercise 10. 
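
The closure and inverse computations for H are easy to confirm on examples. The following sketch uses integer entries so that the comparisons are exact; the helper names `heis` and `heis_inverse` and the sample values are assumptions made for illustration.

```python
def heis(a, b, c):
    """The matrix of the form (1.4) with parameters a, b, c."""
    return [[1, a, b],
            [0, 1, c],
            [0, 0, 1]]


def matmul(A, B):
    """Product of two matrices given as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]


def heis_inverse(a, b, c):
    """Inverse of heis(a, b, c), using the formula obtained by direct computation."""
    return heis(-a, a * c - b, -c)


identity = heis(0, 0, 0)

# The product of two matrices of the form (1.4) is again of that form.
print(matmul(heis(1, 2, 3), heis(4, 5, 6)))

# The claimed inverse really is a two-sided inverse.
print(matmul(heis(2, 5, 3), heis_inverse(2, 5, 3)) == identity)
print(matmul(heis_inverse(2, 5, 3), heis(2, 5, 3)) == identity)
```

Multiplying `heis(1, 0, 0)` and `heis(0, 0, 1)` in the two possible orders gives different results, so H is not commutative.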


1.2.9 The groups $\mathbb{R}^*$, $\mathbb{C}^*$, $S^1$, $\mathbb{R}$, and $\mathbb{R}^n$


Several important groups which are not naturally groups of matrices can (and 
will in these notes) be thought of as such. 

The group $\mathbb{R}^*$ of nonzero real numbers under multiplication is isomorphic
to GL(1;R). Thus, we will regard $\mathbb{R}^*$ as a matrix Lie group. Similarly, the
group $\mathbb{C}^*$ of nonzero complex numbers under multiplication is isomorphic to
GL(1;C), and the group $S^1$ of complex numbers with absolute value one is
isomorphic to U(1).

The group $\mathbb{R}$ under addition is isomorphic to $\mathrm{GL}(1;\mathbb{R})^+$ (1 × 1 real matrices
with positive determinant) via the map $x \mapsto [e^x]$. The group $\mathbb{R}^n$ (with vector
addition) is isomorphic to the group of diagonal real matrices with positive
diagonal entries, via the map


$$(x_1, \ldots, x_n) \mapsto \begin{pmatrix} e^{x_1} & & 0 \\ & \ddots & \\ 0 & & e^{x_n} \end{pmatrix}.$$


1.2.10 The Euclidean and Poincaré groups E(n) and P(n;1) 


The Euclidean group E(n) is, by definition, the group of all one-to-one, onto,
distance-preserving maps of $\mathbb{R}^n$ to itself, that is, maps $f : \mathbb{R}^n \to \mathbb{R}^n$ such that
$d(f(x), f(y)) = d(x, y)$ for all $x, y \in \mathbb{R}^n$. Here, d is the usual distance on $\mathbb{R}^n$:
$d(x, y) = |x - y|$. Note that we do not assume anything about the structure
of f besides the above properties. In particular, f need not be linear. The
orthogonal group O(n) is a subgroup of E(n) and is the group of all linear
distance-preserving maps of $\mathbb{R}^n$ to itself. For $x \in \mathbb{R}^n$, define the translation
by x, denoted $T_x$, by

$$T_x(y) = x + y.$$

The set of translations is also a subgroup of E(n). 


Proposition 1.5. Every element T of E(n) can be written uniquely as an 
orthogonal linear transformation followed by a translation, that is, in the form 
$$T = T_x R$$
with $x \in \mathbb{R}^n$ and $R \in \mathrm{O}(n)$.


We will not prove this. The key step is to prove that every one-to-one, 
onto, distance-preserving map of $\mathbb{R}^n$ to itself which fixes the origin must be
linear. We will write an element $T = T_x R$ of E(n) as a pair $\{x, R\}$. Note that
for $y \in \mathbb{R}^n$,

$$\{x, R\}y = Ry + x$$


and that
$$\{x_1, R_1\}\{x_2, R_2\}y = R_1(R_2 y + x_2) + x_1 = R_1 R_2 y + (x_1 + R_1 x_2).$$
Thus, the product operation for E(n) is the following:
$$\{x_1, R_1\}\{x_2, R_2\} = \{x_1 + R_1 x_2, R_1 R_2\}. \tag{1.5}$$
The inverse of an element of E(n) is given by
$$\{x, R\}^{-1} = \{-R^{-1}x, R^{-1}\}.$$


As already noted, E(n) is not a subgroup of GL(n; R), since translations are 
not linear maps. However, E(n) is isomorphic to a subgroup of GL(n + 1;R), 
via the map which associates to {x, R} € E(n) the following matrix: 


$$\begin{pmatrix} & & & x_1 \\ & R & & \vdots \\ & & & x_n \\ 0 & \cdots & 0 & 1 \end{pmatrix}. \tag{1.6}$$


This map is clearly one-to-one, and direct computation shows that multipli- 
cation of elements of the form (1.6) follows the multiplication rule in (1.5), so 
that this map is a homomorphism. Thus, E(n) is isomorphic to the group of 
all matrices of the form (1.6) (with $R \in \mathrm{O}(n)$). The limit of matrices of the form
(1.6) is again of that form, and so we have expressed the Euclidean group 
E(n) as a matrix Lie group. 
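
The claim that the matrices (1.6) multiply according to the rule (1.5) can be verified on an example in the plane. The sketch below is illustrative only: pairs $\{x, R\}$ are stored as Python tuples, and the names `compose`, `embed`, and `act`, together with the sample elements, are assumptions made for the example.

```python
def matmul(A, B):
    """Product of two matrices given as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]


def apply(R, v):
    """Matrix-vector product R v."""
    return [sum(R[i][k] * v[k] for k in range(len(v))) for i in range(len(R))]


def act(e, y):
    """Action of the pair {x, R} on a point y: y -> R y + x."""
    x, R = e
    Ry = apply(R, y)
    return [Ry[i] + x[i] for i in range(len(x))]


def compose(e1, e2):
    """Product rule (1.5): {x1, R1}{x2, R2} = {x1 + R1 x2, R1 R2}."""
    (x1, R1), (x2, R2) = e1, e2
    R1x2 = apply(R1, x2)
    return ([x1[i] + R1x2[i] for i in range(len(x1))], matmul(R1, R2))


def embed(e):
    """Matrix (1.6): R in the upper-left block, x in the last column, bottom row (0, ..., 0, 1)."""
    x, R = e
    n = len(x)
    rows = [list(R[i]) + [x[i]] for i in range(n)]
    rows.append([0] * n + [1])
    return rows


quarter_turn = [[0, -1], [1, 0]]   # rotation by 90 degrees, an element of O(2)
flip = [[1, 0], [0, -1]]           # reflection, also an element of O(2)
e1 = ([3, 1], quarter_turn)
e2 = ([-2, 5], flip)

print(matmul(embed(e1), embed(e2)) == embed(compose(e1, e2)))
print(act(compose(e1, e2), [7, -4]) == act(e1, act(e2, [7, -4])))
```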

We similarly define the Poincaré group P(n; 1) to be the group of all transformations
of $\mathbb{R}^{n+1}$ of the form


$$T = T_x A$$


with $x \in \mathbb{R}^{n+1}$ and $A \in \mathrm{O}(n;1)$. This is the group of affine transformations
of $\mathbb{R}^{n+1}$ which preserve the Lorentz “distance”
$d_L(x, y) = (x_1 - y_1)^2 + \cdots + (x_n - y_n)^2 - (x_{n+1} - y_{n+1})^2$. (An affine transformation is one of the form
$x \mapsto Ax + b$, where A is a linear transformation and b is constant.) The group
product is the obvious analog of the product (1.5) for the Euclidean group. 

The Poincaré group P(n; 1) is isomorphic to the group of (n + 2) × (n + 2)
matrices of the form


$$\begin{pmatrix} & & & x_1 \\ & A & & \vdots \\ & & & x_{n+1} \\ 0 & \cdots & 0 & 1 \end{pmatrix} \tag{1.7}$$


with $A \in \mathrm{O}(n;1)$. The set of matrices of the form (1.7) is a matrix Lie group.


1.3 Compactness 


Definition 1.6. A matrix Lie group G is said to be compact if the following 
two conditions are satisfied: 


1. If $A_m$ is any sequence of matrices in G, and $A_m$ converges to a matrix A,
then A is in G.

2. There exists a constant C such that for all $A \in G$, $|A_{ij}| \le C$ for all
$1 \le i, j \le n$.


This is not the usual topological definition of compactness. However, the 
set $M_n(\mathbb{C})$ of all n × n complex matrices can be thought of as $\mathbb{C}^{n^2}$. The above
definition says that G is compact if it is a closed, bounded subset of $\mathbb{C}^{n^2}$. It is
a standard theorem from elementary analysis that a subset of $\mathbb{C}^m$ is compact
if and only if it is closed and bounded.

All of our examples of matrix Lie groups except GL(n;R) and GL(n; C) 
have property (1). Thus, it is the boundedness condition (2) that is most 
important. 


1.3.1 Examples of compact groups 


The groups O(n) and SO(n) are compact. Property (1) is satisfied because 
the limit of orthogonal matrices is orthogonal and the limit of matrices with 
determinant one has determinant one. Property (2) is satisfied because if A is 
orthogonal, then the column vectors of A have norm one, and hence $|A_{kl}| \le 1$,
for all $1 \le k, l \le n$. A similar argument shows that U(n), SU(n), and Sp(n)
are compact. (This includes the unit circle, $S^1 \cong \mathrm{U}(1)$.)


1.3.2 Examples of noncompact groups 


All of the other examples given of matrix Lie groups are noncompact. The 
groups GL(n;R) and GL(n;C) violate property (1), since a limit of invertible 
matrices may be noninvertible. The groups SL(n; R) and SL(n; C) violate (2)
(except in the trivial case n = 1), since

$$A_m = \begin{pmatrix} m & & & & \\ & \frac{1}{m} & & & \\ & & 1 & & \\ & & & \ddots & \\ & & & & 1 \end{pmatrix}$$

has determinant one, no matter how large m is.

The following groups also violate (2), and hence are noncompact: O(n; C) 
and SO(n; C); O(n; k) and SO(n; k) ($n \ge 1$, $k \ge 1$); the Heisenberg group H;
Sp(n;R) and Sp(n;C); E(n) and P(n;1); $\mathbb{R}$ and $\mathbb{R}^n$; $\mathbb{R}^*$ and $\mathbb{C}^*$. It is left to
the reader to provide examples to show that this is the case. 


1.4 Connectedness 


Definition 1.7. A matrix Lie group G is said to be connected if given any
two matrices A and B in G, there exists a continuous path A(t), $a \le t \le b$,
lying in G with A(a) = A and A(b) = B.


This property is what is called path-connected in topology, which is not 
(in general) the same as connected. However, it is a fact (not particularly 
obvious at the moment) that a matrix Lie group is connected if and only if it 
is path-connected. So, in a slight abuse of terminology, we shall continue to 
refer to the above property as connectedness. (See Section 1.8.) 

A matrix Lie group G which is not connected can be decomposed (uniquely) 
as a union of several pieces, called components, such that two elements of 
the same component can be joined by a continuous path, but two elements of 
different components cannot. 


Proposition 1.8. If G is a matrix Lie group, then the component of G con- 
taining the identity is a subgroup of G. 


Proof. Saying that A and B are both in the component containing the identity 
means that there exist continuous paths A(t) and B(t) with A(0) = B(0) = I, 
A(1) = A, and B(1) = B. Then, A(t)B(t) is a continuous path starting at I 
and ending at AB. Thus, the product of two elements of the identity component 
is again in the identity component. Furthermore, A(t)^{-1} is a continuous 
path starting at I and ending at A^{-1}, and so the inverse of any element of 
the identity component is again in the identity component. Thus, the identity 
component is a subgroup. □ 


Note that because matrix multiplication and matrix inversion are continuous 
on GL(n;C), it follows that if A(t) and B(t) are continuous, then so are 
A(t)B(t) and A(t)^{-1}. The continuity of the matrix product is obvious. The 
continuity of the inverse follows from the formula for the inverse in terms 
of cofactors; this formula is continuous as long as we remain in the set of 
invertible matrices where the determinant in the denominator is nonzero. 


Proposition 1.9. The group GL(n;C) is connected for all n ≥ 1. 


Proof. Consider first the case n = 1. A 1 × 1 invertible complex matrix A is 
of the form A = [λ] with λ in C*, the set of nonzero complex numbers. Given 
any two nonzero complex numbers, we can easily find a continuous path which 
connects them and does not pass through zero. 

For the case n ≥ 2, we will show that any element of GL(n;C) can be 
connected to the identity by a continuous path lying in GL(n;C). Then, any 
two elements A and B of GL(n;C) can be connected by a path going from A 
to the identity and then from the identity to B. 

We make use of the result that every matrix is similar to an upper 
triangular matrix (Theorem B.7). That is, given any n × n complex matrix A, 
there exists an invertible n × n complex matrix C such that 

\[
A = CBC^{-1},
\]

where B is upper triangular: 

\[
B = \begin{pmatrix} \lambda_1 & & * \\ & \ddots & \\ 0 & & \lambda_n \end{pmatrix}.
\]

If we now assume that A is invertible, then all the λ_i's must be nonzero, 
since det A = det B = λ_1 ⋯ λ_n. Let B(t) be obtained by multiplying the part 
of B above the diagonal by (1 - t), for 0 ≤ t ≤ 1, and let A(t) = CB(t)C^{-1}. 
Then, A(t) is a continuous path which starts at A and ends at CDC^{-1}, where 
D is the diagonal matrix 

\[
D = \begin{pmatrix} \lambda_1 & & 0 \\ & \ddots & \\ 0 & & \lambda_n \end{pmatrix}.
\]

This path lies in GL(n;C) since det A(t) = λ_1 ⋯ λ_n = det A for all t. 
Now, as in the case n = 1, we can define λ_i(t), which connects each λ_i to 1 
in C* as t goes from 1 to 2. Then, we can define A(t) on the interval 1 ≤ t ≤ 2 
by 

\[
A(t) = C \begin{pmatrix} \lambda_1(t) & & 0 \\ & \ddots & \\ 0 & & \lambda_n(t) \end{pmatrix} C^{-1}.
\]

This is a continuous path which starts at CDC^{-1} when t = 1 and ends at 
I (= CIC^{-1}) when t = 2. Since the λ_k(t)'s are always nonzero, A(t) lies in 
GL(n;C). We see, then, that every matrix A in GL(n;C) can be connected to 
the identity by a continuous path lying in GL(n;C). □ 
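
The two-stage path in this proof can be traced numerically. The following is a minimal sketch, assuming n = 2 and C = I (so that A is already upper triangular); the function names `path_to_identity` and `gl2_path` are hypothetical, and the polar-form choice for the path λ_i(t) is just one convenient option that the text does not prescribe.

```python
import cmath

def path_to_identity(lam, s):
    """Point at time s in [0, 1] on a path from nonzero lam to 1 that avoids 0.

    Assumed choice: shrink the modulus and the argument simultaneously.
    """
    r, theta = cmath.polar(lam)
    return (r ** (1 - s)) * cmath.exp(1j * (1 - s) * theta)

def gl2_path(b, t):
    """Path A(t), 0 <= t <= 2, from upper triangular b to the identity."""
    (l1, x), (_, l2) = b
    if t <= 1:
        # Stage 1: scale the entry above the diagonal by (1 - t).
        return [[l1, (1 - t) * x], [0, l2]]
    # Stage 2: move each diagonal entry to 1 inside C*.
    s = t - 1
    return [[path_to_identity(l1, s), 0], [0, path_to_identity(l2, s)]]

def det2(m):
    """Determinant of a 2 x 2 matrix."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

B = [[2 + 1j, 5], [0, -3]]
samples = [gl2_path(B, k / 50) for k in range(101)]
# The determinant stays away from zero along the sampled path.
min_abs_det = min(abs(det2(m)) for m in samples)
```

Sampling only checks finitely many points, so this illustrates the argument rather than proving it.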


An alternative proof of this result is given in Exercise 12. 


Proposition 1.10. The group SL(n;C) is connected for all n ≥ 1. 


Proof. The proof is almost the same as for GL(n;C), except that we must be 
careful to preserve the condition det A = 1. Let A be an arbitrary element 
of SL(n;C). The case n = 1 is trivial, so we assume n ≥ 2. We can define 
A(t) as before for 0 ≤ t ≤ 1, with A(0) = A and A(1) = CDC^{-1}, since 
det A(t) = det A = 1. Now, define λ_k(t) as before for 1 ≤ k ≤ n - 1 and define 
λ_n(t) to be [λ_1(t) ⋯ λ_{n-1}(t)]^{-1}. (Note that since λ_1 ⋯ λ_n = 1, 
λ_n(1) = λ_n.) This allows us to connect A to the identity while staying 
within SL(n;C). □ 


Proposition 1.11. The groups U(n) and SU(n) are connected, for all n ≥ 1. 


Proof. By a standard result of linear algebra (Theorem B.3), every unitary 
matrix has an orthonormal basis of eigenvectors, with eigenvalues of the form 
e^{iθ}. It follows that every unitary matrix U can be written as 

\[
U = U_1 \begin{pmatrix} e^{i\theta_1} & & 0 \\ & \ddots & \\ 0 & & e^{i\theta_n} \end{pmatrix} U_1^{-1}
\tag{1.8}
\]

with U_1 unitary and θ_i ∈ R. Conversely, as is easily checked, every matrix of 
the form (1.8) is unitary. Now, define 

\[
U(t) = U_1 \begin{pmatrix} e^{i(1-t)\theta_1} & & 0 \\ & \ddots & \\ 0 & & e^{i(1-t)\theta_n} \end{pmatrix} U_1^{-1}.
\]

As t ranges from 0 to 1, this defines a continuous path in U(n) joining U to 
I. Thus, any two elements U and V of U(n) can be connected to each other 
by a continuous path that runs from U to I and then from I to V. 

A slight modification of this argument, as in the proof of Proposition 1.10, 
shows that SU(n) is connected. □ 
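
The path U(t) can be checked numerically in a small case. The sketch below assumes n = 2, takes U_1 to be a real rotation matrix (an arbitrary choice made here), and uses hypothetical helper names `unitary_path` and `unitarity_defect`; it confirms that each sampled U(t) is unitary up to rounding and that U(1) is the identity.

```python
import cmath
import math

def matmul(a, b):
    """Product of two matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def dagger(a):
    """Conjugate transpose (adjoint) of a matrix."""
    return [[a[j][i].conjugate() for j in range(len(a))]
            for i in range(len(a[0]))]

def unitary_path(u1, thetas, t):
    """U(t) = U_1 diag(e^{i(1-t)theta_k}) U_1^{-1}, using U_1^{-1} = U_1^*."""
    n = len(thetas)
    d = [[cmath.exp(1j * (1 - t) * thetas[k]) if j == k else 0
          for j in range(n)] for k in range(n)]
    return matmul(matmul(u1, d), dagger(u1))

def unitarity_defect(u):
    """Largest entry of |U U^* - I|."""
    prod = matmul(u, dagger(u))
    n = len(u)
    return max(abs(prod[i][j] - (1 if i == j else 0))
               for i in range(n) for j in range(n))

c, s = math.cos(0.7), math.sin(0.7)
U1 = [[c, -s], [s, c]]          # assumed unitary change of basis
THETAS = [1.2, -2.5]            # assumed eigenvalue angles
worst = max(unitarity_defect(unitary_path(U1, THETAS, k / 20))
            for k in range(21))
end = unitary_path(U1, THETAS, 1.0)
```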


Proposition 1.12. The group GL(n;R) is not connected, but has two 
components. These are GL(n;R)^+, the set of n × n real matrices with positive 
determinant, and GL(n;R)^-, the set of n × n real matrices with negative 
determinant. 


Proof. GL(n;R) cannot be connected, for if det A > 0 and det B < 0, then 
any continuous path connecting A to B would have to include a matrix with 
determinant zero and hence pass outside of GL(n; R). 

The proof that GL(n;R)^+ is connected is sketched in Exercise 15. Once 
GL(n;R)^+ is known to be connected, it is not difficult to see that GL(n;R)^- 
is also connected. Let C be any matrix with negative determinant and take 
A and B in GL(n;R)^-. Then, C^{-1}A and C^{-1}B are in GL(n;R)^+ and can be 
joined by a continuous path D(t) in GL(n;R)^+. However, then, CD(t) is a 
continuous path joining A and B in GL(n;R)^-. □ 


The following table lists some matrix Lie groups, indicates whether or not 
the group is connected, and gives the number of components: 


Group         Connected?   Components 

GL(n;C)       yes          1 
SL(n;C)       yes          1 
GL(n;R)       no           2 
SL(n;R)       yes          1 
O(n)          no           2 
SO(n)         yes          1 
U(n)          yes          1 
SU(n)         yes          1 
O(n;1)        no           4 
SO(n;1)       no           2 
Heisenberg    yes          1 
E(n)          no           2 
P(n;1)        no           4 


Proofs of some of these results are given in Exercises 7, 13, 14, and 15. 


1.5 Simple Connectedness 


Definition 1.13. A matrix Lie group G is said to be simply connected if it 
is connected and, in addition, every loop in G can be shrunk continuously to 
a point in G. 

More precisely, assume that G is connected. Then, G is simply connected 
if given any continuous path A(t), 0 ≤ t ≤ 1, lying in G with A(0) = A(1), 
there exists a continuous function A(s,t), 0 ≤ s,t ≤ 1, taking values in G and 
having the following properties: (1) A(s,0) = A(s,1) for all s, (2) A(0,t) = 
A(t), and (3) A(1,t) = A(1,0) for all t. 


One should think of A(t) as a loop and A(s,t) as a family of loops, pa- 
rameterized by the variable s which shrinks A(t) to a point. Condition 1 says 
that for each value of the parameter s, we have a loop; condition 2 says that 
when s = 0 the loop is the specified loop A(t); and condition 3 says that when 
s = 1 our loop is a point. 


Proposition 1.14. The group SU(2) is simply connected. 


Proof. Exercise 8 shows that SU(2) may be thought of (topologically) as the 
three-dimensional sphere S^3 sitting inside R^4. It is well known that S^3 is 
simply connected. □ 


The condition of simple connectedness is extremely important. One of our 
most important theorems will be that if G is simply connected, then there is a 
natural one-to-one correspondence between the representations of G and the 
representations of its Lie algebra. 

For any path-connected topological space, one can define an object called 
the fundamental group. See Appendix E for more information. A topolog- 
ical space is simply connected if and only if the fundamental group is the 
trivial group {1}. I now provide the following tables of fundamental groups, 
first for compact groups and then for noncompact groups. See Appendix E 
for the methods of proof. Here, SO_e(n;1) denotes the identity component of 
SO(n;1) (since one defines the fundamental group only for connected groups). 
In each entry, the result is understood to apply for all n ≥ 1 unless otherwise 
stated. 


Group                Simply connected?   Fundamental group 

SO(2)                no                  Z 
SO(n) (n ≥ 3)        no                  Z/2 
U(n)                 no                  Z 
SU(n)                yes                 {1} 
Sp(n)                yes                 {1} 


Group                Simply connected?   Fundamental group 

GL(n;R)^+ (n ≥ 2)    no                  same as SO(n) 
GL(n;C)              no                  Z 
SL(n;R) (n ≥ 2)      no                  same as SO(n) 
SL(n;C)              yes                 {1} 
SO(n;C)              no                  same as SO(n) 
SO_e(1;1)            yes                 {1} 
SO_e(n;1) (n ≥ 2)    no                  same as SO(n) 
Sp(n;R)              no                  Z 
Sp(n;C)              yes                 {1} 


We conclude this section with a discussion of the case of SO(3). If v is a unit 
vector in R^3, let R_{v,θ} be the element of SO(3) consisting of a "right-handed" 
rotation by angle θ in the plane perpendicular to v. Here, right-handed means 
that if one places the thumb of one's right hand in the v-direction, the rotation 
is in the direction that one's fingers curl. To say this more mathematically, 
let v^⊥ denote the plane perpendicular to v and let us choose an orthonormal 
basis (u_1, u_2) for v^⊥ in such a way that the basis (u_1, u_2, v) for R^3 has the 
same orientation as the standard basis (e_1, e_2, e_3). (This means that the linear 
map taking (u_1, u_2, v) to (e_1, e_2, e_3) has positive determinant.) We then use 
the basis (u_1, u_2) to identify v^⊥ with R^2, and the rotation is then in the 
counterclockwise direction in R^2. 

It is easily seen that R_{-v,θ} is the same as R_{v,-θ}. It is also not hard to 
show (Exercise 16) that every element of SO(3) can be expressed as R_{v,θ}, for 
some v and θ with -π ≤ θ ≤ π. Furthermore, we can arrange that 0 ≤ θ ≤ π 
by replacing v with -v if necessary. 

Now let B denote the closed ball of radius π in R^3 and consider the map 
Φ : B → SO(3) given by 

\[
\Phi(u) = R_{\hat{u}, \|u\|}, \quad u \neq 0,
\qquad
\Phi(0) = I.
\]


Here, û = u/‖u‖ is the unit vector in the u-direction. The map Φ is 
continuous, even at I, since R_{v,θ} approaches the identity as θ approaches zero, 
regardless of how v is behaving. The discussion in the preceding paragraph 
shows that Φ maps B onto SO(3). The map Φ is almost injective, but not quite. 
Since R_{v,π} = R_{-v,π}, antipodal points on the boundary of B (i.e., pairs of 
points of the form (u, -u) with ‖u‖ = π) map to the same element of SO(3). 

This means that SO(3) can be identified (homeomorphically) with B/~, 
where ~ denotes identification of antipodal points on the boundary. It is known 
that B/~ is not simply connected. Specifically, consider the loop in B/~ that 
begins at some vector u of length π and goes in a straight line through the 
origin until it reaches -u. (Since u and -u are identified, this is a loop in 
B/~.) It can be shown that this loop cannot be shrunk continuously to a point 
in B/~. This, then, shows that SO(3) is not simply connected. In fact, B/~ 
is homeomorphic to the manifold RP^3 (real projective space of dimension 3) 
which has fundamental group Z/2. 
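
The two identities used above, R_{-v,θ} = R_{v,-θ} and R_{v,π} = R_{-v,π}, can be checked numerically. The sketch below builds R_{v,θ} with Rodrigues' rotation formula, which is an assumption of this example (the text defines the rotation geometrically and does not give a formula); the names `rotation` and `max_diff` are hypothetical.

```python
import math

def matmul(a, b):
    """Product of two matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def rotation(v, theta):
    """Right-handed rotation by theta about the unit vector v.

    Uses Rodrigues' formula R = I + sin(theta) K + (1 - cos(theta)) K^2,
    where K is the cross-product matrix of v.
    """
    x, y, z = v
    k = [[0.0, -z, y], [z, 0.0, -x], [-y, x, 0.0]]
    k2 = matmul(k, k)
    s, c = math.sin(theta), math.cos(theta)
    return [[(1.0 if i == j else 0.0) + s * k[i][j] + (1.0 - c) * k2[i][j]
             for j in range(3)] for i in range(3)]

def max_diff(a, b):
    """Largest entrywise difference between two 3 x 3 matrices."""
    return max(abs(a[i][j] - b[i][j]) for i in range(3) for j in range(3))

v = (1 / 3, 2 / 3, 2 / 3)            # a unit vector
minus_v = tuple(-c for c in v)

# R_{-v,theta} agrees with R_{v,-theta}.
flip_defect = max_diff(rotation(minus_v, 0.8), rotation(v, -0.8))
# Antipodal boundary points of the ball give the same rotation.
antipodal_defect = max_diff(rotation(v, math.pi), rotation(minus_v, math.pi))
```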


1.6 Homomorphisms and Isomorphisms 


Definition 1.15. Let G and H be matrix Lie groups. A map Φ from G to H 
is called a Lie group homomorphism if (1) Φ is a group homomorphism 
and (2) Φ is continuous. If, in addition, Φ is one-to-one and onto and the 
inverse map Φ^{-1} is continuous, then Φ is called a Lie group isomorphism. 


The condition that Φ be continuous should be regarded as a technicality, in 
that it is very difficult to give an example of a group homomorphism between 
two matrix Lie groups which is not continuous. In fact, if G = R and H = C*, 
then any group homomorphism from G to H which is even measurable (a very 
weak condition) must be continuous. (See Exercise 17 in Chapter 9 of Rudin 
(1987).) 

Note that the inverse of a Lie group isomorphism is continuous (by defi- 
nition) and a group homomorphism (by elementary group theory), and thus 
a Lie group isomorphism. If G and H are matrix Lie groups and there exists 
a Lie group isomorphism from G to H, then G and H are said to be 
isomorphic, and we write G ≅ H. Two matrix Lie groups which are isomorphic 
should be thought of as being essentially the same group. 

The simplest interesting example of a Lie group homomorphism is the 
determinant, which is a homomorphism of GL(n;C) into C*. Another simple 
example is the map Φ : R → SO(2) given by 

\[
\Phi(\theta) = \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix}.
\]

This map is clearly continuous, and calculation (using standard trigonometric 
identities) shows that it is a homomorphism. (Compare Exercise 6.) 
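
For reference, the calculation can be written out as follows. This is only a sketch using the angle-addition formulas, where Φ(θ) denotes the rotation matrix with rows (cos θ, -sin θ) and (sin θ, cos θ), and φ is a second real number (a symbol introduced here for illustration).

```latex
\[
\Phi(\theta)\Phi(\phi)
= \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix}
  \begin{pmatrix} \cos\phi & -\sin\phi \\ \sin\phi & \cos\phi \end{pmatrix}
= \begin{pmatrix}
    \cos\theta\cos\phi - \sin\theta\sin\phi & -(\sin\theta\cos\phi + \cos\theta\sin\phi) \\
    \sin\theta\cos\phi + \cos\theta\sin\phi & \cos\theta\cos\phi - \sin\theta\sin\phi
  \end{pmatrix}
= \Phi(\theta + \phi).
\]
```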


1.6.1 Example: SU(2) and SO(3) 


A very important topic for us will be the relationship between the groups 
SU(2) and SO(3). This example is designed to show that SU(2) and SO(3) 
are almost (but not quite!) isomorphic. Specifically, there exists a Lie group 
homomorphism Φ which maps SU(2) onto SO(3) and which is two-to-one. We 
now describe this map. 

Consider the space V of all 2 × 2 complex matrices which are self-adjoint 
(i.e., A* = A) and have trace zero. This is a three-dimensional real vector 
space with the following basis: 

\[
A_1 = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \quad
A_2 = \begin{pmatrix} 0 & i \\ -i & 0 \end{pmatrix}, \quad
A_3 = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}.
\]

We may define an inner product (Section B.6 of Appendix B) on V by the 
formula 

\[
\langle A, B \rangle = \tfrac{1}{2}\,\mathrm{trace}(AB).
\]

(Except for the factor of 1/2, this is simply the restriction to V of the 
Hilbert-Schmidt inner product described in Section B.6.) Direct computation 
shows that {A_1, A_2, A_3} is an orthonormal basis for V. Having chosen an 
orthonormal basis for V, we can identify V with R^3. 

Now, suppose that U is an element of SU(2) and A is an element of V, and 
consider UAU^{-1}. Then (Section B.5), trace(UAU^{-1}) = trace(A) = 0 and 

\[
(UAU^{-1})^* = (U^{-1})^* A^* U^* = UAU^{-1},
\]

and so UAU^{-1} is again in V. Furthermore, for a fixed U, the map A ↦ UAU^{-1} 
is linear in A. Thus for each U ∈ SU(2), we can define a linear map Φ_U of V 
to itself by the formula 

\[
\Phi_U(A) = UAU^{-1}.
\]

Note that U_1U_2AU_2^{-1}U_1^{-1} = (U_1U_2)A(U_1U_2)^{-1}, and so 
Φ_{U_1U_2} = Φ_{U_1}Φ_{U_2}. Moreover, given U ∈ SU(2) and A, B ∈ V, we have 

\[
\langle \Phi_U(A), \Phi_U(B) \rangle
= \tfrac{1}{2}\,\mathrm{trace}(UAU^{-1}UBU^{-1})
= \tfrac{1}{2}\,\mathrm{trace}(AB)
= \langle A, B \rangle.
\]

Thus, Φ_U is an orthogonal transformation of V. 

Once we identify V with R^3 (using the above orthonormal basis), then we 
may think of Φ_U as an element of O(3). Since Φ_{U_1U_2} = Φ_{U_1}Φ_{U_2}, we see 
that Φ (i.e., the map U ↦ Φ_U) is a homomorphism of SU(2) into O(3). It is easy 
to see that Φ is continuous and, thus, a Lie group homomorphism. Recall that 
every element of O(3) has determinant ±1. Now, SU(2) is connected (Exercise 
8), Φ is continuous, and Φ_I is equal to I, which has determinant one. It follows 
that Φ must actually map SU(2) into the identity component of O(3), namely 
SO(3). 

The map U ↦ Φ_U is not one-to-one, since for any U ∈ SU(2), Φ_U = Φ_{-U}. 
(Observe that if U is in SU(2), then so is -U.) It is possible to show that 
Φ is a two-to-one map of SU(2) onto SO(3). (The least obvious part of this 
assertion is that Φ maps onto SO(3). This will be easy to prove once we have 
introduced the concept of the Lie algebra and proved Theorem 2.21.) The 
significance of this homomorphism is that SO(3) is not simply connected, but 
SU(2) is. The map Φ allows us to relate problems on the non-simply-connected 
group SO(3) to problems on the simply-connected group SU(2). 
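
The construction of Φ can be tried out on a concrete element of SU(2). The following is a minimal sketch: it takes the basis A_1 = [[0,1],[1,0]], A_2 = [[0,i],[-i,0]], A_3 = [[1,0],[0,-1]], computes the 3 × 3 matrix of Φ_U with entries ⟨A_j, U A_k U^{-1}⟩, and checks that it is orthogonal with determinant one and that U and -U give the same matrix. The helper names (`phi`, `su2`, `det3`, `orthogonality_defect`) and the sample values of α and β are assumptions made for illustration.

```python
import cmath
import math

A1 = [[0, 1], [1, 0]]
A2 = [[0, 1j], [-1j, 0]]
A3 = [[1, 0], [0, -1]]
BASIS = [A1, A2, A3]

def matmul(a, b):
    """Product of two matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def dagger(a):
    """Conjugate transpose; equals the inverse when a is unitary."""
    return [[a[j][i].conjugate() for j in range(len(a))]
            for i in range(len(a[0]))]

def inner(a, b):
    """<A, B> = (1/2) trace(AB), which is real for self-adjoint A and B."""
    ab = matmul(a, b)
    return 0.5 * (ab[0][0] + ab[1][1]).real

def phi(u):
    """Matrix of A -> U A U^{-1} with respect to the basis A1, A2, A3."""
    u_inv = dagger(u)
    images = [matmul(matmul(u, a), u_inv) for a in BASIS]
    return [[inner(BASIS[j], images[k]) for k in range(3)] for j in range(3)]

def su2(alpha, beta):
    """Element of SU(2) built from alpha, beta with |alpha|^2 + |beta|^2 = 1."""
    return [[alpha, -beta.conjugate()], [beta, alpha.conjugate()]]

def det3(m):
    """Determinant of a 3 x 3 matrix by cofactor expansion."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

def orthogonality_defect(m):
    """Largest entry of |M M^tr - I|."""
    return max(abs(sum(m[i][k] * m[j][k] for k in range(3)) - (1 if i == j else 0))
               for i in range(3) for j in range(3))

U = su2(math.cos(0.4) * cmath.exp(0.3j), math.sin(0.4) * cmath.exp(-1.1j))
MINUS_U = [[-z for z in row] for row in U]
M = phi(U)
M_MINUS = phi(MINUS_U)
```

A single sample point does not establish that Φ is onto or exactly two-to-one; it only illustrates that Φ_U lands in SO(3) and that Φ_U = Φ_{-U}.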


1.7 The Polar Decomposition for SL(n;R) and SL(n; C) 


In this section, we consider the polar decompositions for SL(n;R) and SL(n;C). 
These decompositions can be used to prove the connectedness of SL(n;R) and 
SL(n;C) and to show that the fundamental groups of SL(n;R) and SL(n;C) 
are the same as those of SO(n) and SU(n), respectively (Appendix E). These 
decompositions are supposed to be analogous to the unique decomposition of 
a nonzero complex number z as z = up, with |u| = 1 and p real and positive. 

A real symmetric matrix P is said to be positive if ⟨x, Px⟩ > 0 for all 
nonzero vectors x ∈ R^n. (Symmetric means that P^{tr} = P.) Equivalently, a 
symmetric matrix is positive if all of its eigenvalues are positive. Given a 
symmetric positive matrix P, there exists an orthogonal matrix R such that 

\[
P = RDR^{-1},
\]

where D is diagonal with positive diagonal entries λ_1, ..., λ_n. (If we choose 
an orthonormal basis v_1, ..., v_n of eigenvectors for P, then R is the matrix 
whose columns are v_1, ..., v_n.) We can then construct a square root of P as 

\[
P^{1/2} = RD^{1/2}R^{-1},
\]

where D^{1/2} is the diagonal matrix whose (positive) diagonal entries are 
λ_1^{1/2}, ..., λ_n^{1/2}. Then, P^{1/2} is also symmetric and positive. It can be 
shown that P^{1/2} is the unique positive symmetric matrix whose square is P 
(Exercise 21). 

We now prove the following result. 


Proposition 1.16. Given A in SL(n;R), there exists a unique pair (R, P) 
such that R € SO(n), P is real, symmetric, and positive, and A = RP. The 
matrix P satisfies det P = 1. 


Proof. If there were such a pair, then we would have A^{tr}A = PR^{-1}RP = P^2. 
Now, A^{tr}A is symmetric (check!) and positive, since ⟨x, A^{tr}Ax⟩ = ⟨Ax, Ax⟩ > 
0, where Ax ≠ 0 because A is invertible. Let us then define P by 

\[
P = (A^{\mathrm{tr}}A)^{1/2},
\]

so that P is real, symmetric, and positive. Since we want A = RP, we must 
set R = AP^{-1} = A((A^{tr}A)^{1/2})^{-1}. We check that R is orthogonal: 

\[
RR^{\mathrm{tr}}
= A\bigl((A^{\mathrm{tr}}A)^{1/2}\bigr)^{-1}\bigl((A^{\mathrm{tr}}A)^{1/2}\bigr)^{-1}A^{\mathrm{tr}}
= A(A^{\mathrm{tr}}A)^{-1}A^{\mathrm{tr}} = I.
\]

This shows that R is in O(n). To check that R is in SO(n), we note that 
1 = det A = det R det P. Since P is positive, we have det P > 0. This means 
that we cannot have det R = -1, so we must have det R = 1. It follows that 
det P = 1 as well. 

We have now established the existence of a pair (R, P) with the desired 
properties. To establish the uniqueness of the pair, we recall that if such a 
pair exists, then we must have P^2 = A^{tr}A. However, we have remarked earlier 
that a real, positive, symmetric matrix has a unique real, positive, symmetric 
square root, so P is unique. It follows that R = AP^{-1} is also unique. □ 
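
The steps of this proof can be followed numerically for n = 2. The sketch below is an illustration under stated assumptions: instead of diagonalizing A^{tr}A, it uses a closed form for the square root of a 2 × 2 positive symmetric matrix M, namely (M + sI)/τ with s = (det M)^{1/2} and τ = (trace M + 2s)^{1/2}, which follows from the Cayley-Hamilton theorem and is not taken from the text. The function name `polar_sl2` is hypothetical.

```python
import math

def matmul(a, b):
    """Product of two matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(a):
    """Transpose of a matrix stored as nested lists."""
    return [list(row) for row in zip(*a)]

def det2(m):
    """Determinant of a 2 x 2 matrix."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

def polar_sl2(a):
    """Return (R, P) with A = R P, where P = (A^tr A)^{1/2} and R = A P^{-1}."""
    m = matmul(transpose(a), a)            # A^tr A, symmetric and positive
    s = math.sqrt(det2(m))
    tau = math.sqrt(m[0][0] + m[1][1] + 2 * s)
    # Closed-form square root of a 2 x 2 positive symmetric matrix (assumed).
    p = [[(m[i][j] + (s if i == j else 0.0)) / tau for j in range(2)]
         for i in range(2)]
    d = det2(p)
    p_inv = [[p[1][1] / d, -p[0][1] / d], [-p[1][0] / d, p[0][0] / d]]
    return matmul(a, p_inv), p

A = [[1.0, 2.0], [0.0, 1.0]]               # an element of SL(2;R)
R, P = polar_sl2(A)
RP = matmul(R, P)
RRt = matmul(R, transpose(R))
```

For this particular A, the factor R comes out as a rotation by -45 degrees, and P has determinant one, in line with Proposition 1.16.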


If P is a self-adjoint complex matrix (i.e., P* = P), then we say P is 
positive if ⟨x, Px⟩ > 0 for all nonzero vectors x in C^n. An argument similar 
to the one above establishes the following polar decomposition for SL(n;C). 


Proposition 1.17. Given A in SL(n;C), there exists a unique pair (U, P) 
with U ∈ SU(n), P self-adjoint and positive, and A = UP. The matrix P 
satisfies det P = 1. 


It is left to the reader to work out the appropriate polar decompositions 
for the groups GL(n; R), GL(n;R)*, and GL(n; C). 


1.8 Lie Groups 


As explained in this section and in Appendix C, a Lie group is something that 
is simultaneously a smooth manifold and a group. As the terminology suggests, 
every matrix Lie group is a Lie group. (This is not at all obvious from the 
definition of a matrix Lie group, but it is true nevertheless, as we will prove in 
the next chapter.) The reverse is not true: Not every Lie group is isomorphic 
to a matrix Lie group. Nevertheless, I have restricted attention in this book to 
matrix Lie groups for several reasons. First, not everyone who wants to learn 
about Lie groups is familiar with manifold theory. Second, even for someone 
familiar with manifolds, the definitions of the Lie algebra and exponential 
mapping for a general Lie group are substantially more complicated and ab- 
stract than in the matrix case. Third, most of the interesting examples of Lie 
groups are matrix Lie groups. Fourth, the results we will prove for matrix Lie 
groups (e.g., about the relationship between Lie group homomorphisms and 
Lie algebra homomorphisms) continue to hold for general Lie groups. Indeed, 
the proofs of these results are much the same as in the general case, except 
that one can get started more quickly in the matrix case. Although in the 
long run the manifold approach to Lie groups is unquestionably the right one, 
the matrix approach allows one to get into the meat of Lie group theory with 
minimal preparation. 

This section gives a very brief account of the manifold approach to Lie 
groups. Appendix C gives more information, and complete accounts can be 
found in standard textbooks such as those by Bröcker and tom Dieck (1985), 
Varadarajan (1974), and Warner (1983). Appendix C gives two examples of 
Lie groups that cannot be represented as matrix Lie groups and also discusses 
two important constructions (covering groups and quotient groups) which can 
be performed for general Lie groups but not for matrix Lie groups. 


Definition 1.18. A Lie group is a differentiable manifold G which is also a 
group and such that the group product 

\[
G \times G \to G
\]

and the inverse map g ↦ g^{-1} are differentiable. 


A manifold is an object that looks locally like a piece of R^n. An example 
would be a torus, the two-dimensional surface of a "doughnut" in R^3, which 
looks locally (but not globally) like R^2. For a precise definition, see Appendix 
C. 


Example. As an example, let 

\[
G = \mathbb{R} \times \mathbb{R} \times S^1
  = \{ (x, y, u) \mid x \in \mathbb{R},\ y \in \mathbb{R},\ u \in S^1 \subset \mathbb{C} \}
\]

and define the group product G × G → G by 

\[
(x_1, y_1, u_1) \cdot (x_2, y_2, u_2) = (x_1 + x_2,\ y_1 + y_2,\ e^{i x_1 y_2} u_1 u_2).
\]

Let us first check that this operation makes G into a group. It is not obvious 
but easily checked that this operation is associative; the product of three 
elements with either grouping is 

\[
(x_1 + x_2 + x_3,\ y_1 + y_2 + y_3,\ e^{i(x_1 y_2 + x_1 y_3 + x_2 y_3)} u_1 u_2 u_3).
\]

There is an identity element in G, namely e = (0,0,1), and each element 
(x, y, u) has an inverse given by (-x, -y, e^{ixy}u^{-1}). 

Thus, G is, in fact, a group. Furthermore, both the group product and the 
map that sends each element to its inverse are clearly smooth, and so G is 
a Lie group. Note that there is nothing about matrices in the way we have 
defined G; that is, G is not given to us as a matrix group. We may still ask 
whether G is isomorphic to some matrix Lie group, but even this is not true. 
As shown in Appendix C, there is no continuous, injective homomorphism of 
G into any GL(n;C). Thus, this example shows that not every Lie group is 
a matrix Lie group. Nevertheless, G is closely related to a matrix Lie group, 
namely the Heisenberg group. The reader is invited to try to work out what 
the relationship is before consulting the appendix. 
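
The group axioms for this example can be spot-checked numerically. The following is a small sketch in which elements of G are stored as tuples (x, y, u) with u a complex number of modulus one, and the product is (x_1 + x_2, y_1 + y_2, e^{i x_1 y_2} u_1 u_2); the function names `mul`, `inv`, and `close`, and the sample elements, are hypothetical choices made for illustration.

```python
import cmath

def mul(g, h):
    """Group product on G = R x R x S^1 as defined in the example."""
    x1, y1, u1 = g
    x2, y2, u2 = h
    return (x1 + x2, y1 + y2, cmath.exp(1j * x1 * y2) * u1 * u2)

def inv(g):
    """Inverse element (-x, -y, e^{ixy} u^{-1})."""
    x, y, u = g
    return (-x, -y, cmath.exp(1j * x * y) / u)

def close(g, h, tol=1e-12):
    """True if two elements agree up to rounding error."""
    return all(abs(a - b) < tol for a, b in zip(g, h))

E = (0.0, 0.0, 1.0)                       # identity element
g1 = (0.3, -1.2, cmath.exp(0.5j))
g2 = (1.1, 0.7, cmath.exp(-2.0j))
g3 = (-0.4, 2.0, cmath.exp(1.3j))

associative = close(mul(mul(g1, g2), g3), mul(g1, mul(g2, g3)))
has_inverse = close(mul(g1, inv(g1)), E) and close(mul(inv(g1), g1), E)
commutes = close(mul(g1, g2), mul(g2, g1))   # expected to fail: G is not abelian
```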


Now let us think about the question of whether every matrix Lie group 
is a Lie group. This is certainly not obvious, since nothing in our definition 
of a matrix Lie group says anything about its being a manifold. (Indeed, the 
whole point of considering matrix Lie groups is that one can define and study 
them without having to go through manifold theory first!) Nevertheless, it is 
true that every matrix Lie group is a Lie group, and it would be a particularly 
misleading choice of terminology if this were not so. 


Theorem 1.19. Every matrix Lie group is a smooth embedded submanifold 
of M_n(C) and is thus a Lie group. 


The proof of this theorem makes use of the notion of the Lie algebra of a 
matrix Lie group and is given in Chapter 2. Let us think first about the case 
of GL(n;C). This is an open subset of the space M_n(C) and thus a manifold 
of (real) dimension 2n^2. The matrix product is certainly a smooth map of 
M_n(C) to itself, and the map that sends a matrix to its inverse is smooth 
on GL(n;C), by the formula for the inverse in terms of the classical adjoint. 
Thus, GL(n;C) itself is a Lie group. If G ⊂ GL(n;C) is a matrix Lie group, 
then we will prove in Chapter 2 that G is a smooth embedded submanifold 
of GL(n;C). (See Corollary 2.33 to Theorem 2.27.) The matrix product and 
inverse will be restrictions of smooth maps to smooth submanifolds and, thus, 
will be smooth. This will show, then, that G is also a Lie group. 

It is customary to call a map Φ between two Lie groups a Lie group 
homomorphism if Φ is a group homomorphism and Φ is smooth, whereas 
we have (in Definition 1.15) required only that Φ be continuous. However, 
the following proposition shows that our definition is equivalent to the more 
standard one. 


Proposition 1.20. Let G and H be Lie groups and let Φ be a group 
homomorphism from G to H. If Φ is continuous, it is also smooth. 


Thus, group homomorphisms from G to H come in only two varieties: the 
very bad ones (discontinuous) and the very good ones (smooth). There simply 
are not any intermediate ones. (See, for example, Exercise 19.) We will prove 
this in the next chapter (for the case of matrix Lie groups). See Corollary 2.34 
to Theorem 2.27. 

In light of Theorem 1.19, every matrix Lie group is a (smooth) manifold. 
As such, a matrix Lie group is automatically locally path-connected. It follows 
that a matrix Lie group is path-connected if and only if it is connected. (See 
the remarks following Definition 1.7.) 


1.9 Exercises 


1. Let a be an irrational real number and let G be the following subgroup of 
GL(2;C): 

\[
G = \left\{ \begin{pmatrix} e^{it} & 0 \\ 0 & e^{ita} \end{pmatrix} \,\middle|\, t \in \mathbb{R} \right\}.
\]

Show that 

\[
\overline{G} = \left\{ \begin{pmatrix} e^{it} & 0 \\ 0 & e^{is} \end{pmatrix} \,\middle|\, t, s \in \mathbb{R} \right\},
\]

where Ḡ denotes the closure of the set G inside the space of 2 × 2 matrices. 
Assume the following result: The set of numbers of the form e^{2πina}, n ∈ Z, 
is dense in S^1. 

Note: The group Ḡ can be thought of as the torus S^1 × S^1, which, in 
turn, can be thought of as [0,2π] × [0,2π], with the ends of the intervals 
identified. The set G ⊂ [0,2π] × [0,2π] is called an irrational line. 
Drawing a picture of this set should make it plausible that G is dense in 
[0,2π] × [0,2π]. 

2. Orthogonal groups. Let ⟨·,·⟩ denote the standard inner product on R^n: 
⟨x,y⟩ = Σ_k x_k y_k. Show that a matrix A preserves this inner product if 
and only if the column vectors of A are orthonormal. 

Show that for any n × n real matrix B, 

\[
\langle Bx, y \rangle = \langle x, B^{\mathrm{tr}} y \rangle,
\]

where (B^{tr})_{kl} = B_{lk}. Using this, show that a matrix A preserves the inner 
product on R^n if and only if A^{tr}A = I. 

Note: A similar analysis applies to the complex orthogonal groups O(n;C) 
and SO(n;C). 

3. Unitary groups. Let ⟨·,·⟩ denote the standard inner product on C^n: 
⟨x,y⟩ = Σ_k x̄_k y_k. Following Exercise 2, show that ⟨Ax, Ay⟩ = ⟨x,y⟩ for 
all x,y ∈ C^n if and only if A*A = I and that this holds if and only if the 
columns of A are orthonormal. Here, (A*)_{kl} is the complex conjugate of A_{lk}. 

4. Generalized orthogonal groups. Let [·,·]_{n,k} be the symmetric bilinear form 
on R^{n+k} defined in (1.1). Let g be the (n + k) × (n + k) diagonal matrix 
with first n diagonal entries equal to one and last k diagonal entries equal 
to minus one: 

\[
g = \begin{pmatrix} I_n & 0 \\ 0 & -I_k \end{pmatrix}.
\]

Show that for all x, y ∈ R^{n+k}, 

\[
[x, y]_{n,k} = \langle x, g y \rangle.
\]

Show that a (n + k) × (n + k) real matrix A is in O(n;k) if and only if 
A^{tr}gA = g. Show that O(n;k) and SO(n;k) are subgroups of GL(n + k;R) 
and are matrix Lie groups. 

5. Symplectic groups. Let B[x,y] be the skew-symmetric bilinear form on 
R^{2n} given by B[x,y] = Σ_{k=1}^{n} (x_k y_{n+k} - x_{n+k} y_k). Let J be the 2n × 2n 
matrix 

\[
J = \begin{pmatrix} 0 & I \\ -I & 0 \end{pmatrix}.
\]

Show that for all x, y ∈ R^{2n}, 

\[
B[x, y] = \langle x, J y \rangle.
\]

Show that a 2n × 2n matrix A is in Sp(n;R) if and only if A^{tr}JA = J. 
Show that Sp(n;R) is a subgroup of GL(2n;R) and a matrix Lie group. 
Note: A similar analysis applies to Sp(n;C). 


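6. The groups O(2) and SO(2). Show that the matrix 

\[
\begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix}
\]

is in SO(2) and that 

\[
\begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix}
\begin{pmatrix} \cos\phi & -\sin\phi \\ \sin\phi & \cos\phi \end{pmatrix}
= \begin{pmatrix} \cos(\theta+\phi) & -\sin(\theta+\phi) \\ \sin(\theta+\phi) & \cos(\theta+\phi) \end{pmatrix}.
\]

Show that every element A of O(2) is of one of the two forms: 

\[
A = \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix}
\quad\text{or}\quad
A = \begin{pmatrix} \cos\theta & \sin\theta \\ \sin\theta & -\cos\theta \end{pmatrix}.
\]

(Note that if A is of the first form, then det A = 1, and if A is of the 
second form, then det A = -1.) 

Hint: Recall that for A to be in O(2), the columns of A must be orthonormal. 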


7. The groups O(1;1) and SO(1;1). Show that the matrix 

\[
\begin{pmatrix} \cosh t & \sinh t \\ \sinh t & \cosh t \end{pmatrix}
\]

is in SO(1;1) and that 

\[
\begin{pmatrix} \cosh t & \sinh t \\ \sinh t & \cosh t \end{pmatrix}
\begin{pmatrix} \cosh s & \sinh s \\ \sinh s & \cosh s \end{pmatrix}
= \begin{pmatrix} \cosh(t+s) & \sinh(t+s) \\ \sinh(t+s) & \cosh(t+s) \end{pmatrix}.
\]

Show that every element of O(1;1) can be written in one of the four forms: 

\[
\begin{pmatrix} \cosh t & \sinh t \\ \sinh t & \cosh t \end{pmatrix}, \quad
\begin{pmatrix} -\cosh t & \sinh t \\ \sinh t & -\cosh t \end{pmatrix}, \quad
\begin{pmatrix} \cosh t & -\sinh t \\ \sinh t & -\cosh t \end{pmatrix}, \quad
\begin{pmatrix} -\cosh t & -\sinh t \\ \sinh t & \cosh t \end{pmatrix}.
\]

(Note that since cosh t is always positive, there is no overlap among the 
four cases. Note also that matrices of the first two forms have determinant 
one and matrices of the last two forms have determinant minus one.) 
Hint: Use condition (1.2). 

8. The group SU(2). Show that if α and β are arbitrary complex numbers 
satisfying |α|^2 + |β|^2 = 1, then the matrix 

\[
A = \begin{pmatrix} \alpha & -\bar{\beta} \\ \beta & \bar{\alpha} \end{pmatrix}
\]

is in SU(2). Show that every A ∈ SU(2) can be expressed in this form 
for a unique pair (α, β) satisfying |α|^2 + |β|^2 = 1. (Thus, SU(2) can be 
thought of as the three-dimensional sphere S^3 sitting inside C^2 = R^4. In 
particular, this shows that SU(2) is simply connected.) 

9. The groups Sp(1;R), Sp(1;C), and Sp(1). Show that Sp(1;R) = SL(2;R), 
Sp(1;C) = SL(2;C), and Sp(1) = SU(2). 

10. The Heisenberg group. Determine the center Z(H) of the Heisenberg group 
H. Show that the quotient group H/Z(H) is abelian. 

11. A subset E of a matrix Lie group G is called discrete if for each A in E 
there is a neighborhood U of A in G such that U contains no point in E 
except for A. Suppose that G is a connected matrix Lie group and N is 
a discrete normal subgroup of G. Show that N is contained in the center 
of G. 

12. This problem gives an alternative proof of Proposition 1.9, namely that 
GL(n;C) is connected. Suppose A and B are invertible n × n matrices. 
Show that there are only finitely many complex numbers λ for which 
det(λA + (1 - λ)B) = 0. Show that there exists a continuous path A(t) 
of the form A(t) = λ(t)A + (1 - λ(t))B connecting A to B and such that 
A(t) lies in GL(n;C). Here, λ(t) is a continuous path in the plane with 
λ(0) = 0 and λ(1) = 1. 

13. Connectedness of SO(n). Show that SO(n) is connected, using the 
following outline. 
For the case n = 1, there is nothing to show, since a 1 × 1 matrix with 
determinant one must be [1]. Assume, then, that n ≥ 2. Let e_1 denote the 
unit vector with entries 1,0,...,0 in R^n. Given any unit vector v ∈ R^n, 
show that there exists a continuous path R(t) in SO(n) with R(0) = I 
and R(1)v = e_1. (Thus, any unit vector can be "continuously rotated" to 
e_1.) 
Now, show that any element R of SO(n) can be connected to a 
block-diagonal matrix of the form 

\[
\begin{pmatrix} 1 & 0 \\ 0 & R_1 \end{pmatrix}
\]

with R_1 ∈ SO(n - 1) and proceed by induction. 


14. The connectedness of SL(n;R). Using the polar decomposition of SL(n;R) 
(Proposition 1.16) and the connectedness of SO(n) (Exercise 13), show 
that SL(n;R) is connected. 

Hint: Recall that if P is a real, symmetric matrix, then there exists a real, 
orthogonal matrix R_1 such that P = R_1DR_1^{-1}, where D is diagonal. 

15. The connectedness of GL(n;R)^+. Using the connectedness of SL(n;R) 
(Exercise 14) show that GL(n;R)^+ is connected. 

16. If R is an element of SO(3), show that R must have an eigenvector with 
eigenvalue 1. 

Hint: Since SO(3) ⊂ SU(3), every (real or complex) eigenvalue of R must 
have absolute value 1. 

17. Show that the set of translations is a normal subgroup of the Euclidean 
group E(n). Show that the quotient group E(n)/(translations) is 
isomorphic to O(n). (Assume Proposition 1.5.) 

18. Let a be an irrational real number. Show that the set of numbers of the 
form e^{2πina}, n ∈ Z, is dense in S^1. (See Problem 1.) 

19. Show that every continuous homomorphism Φ from R to S^1 is of the form 
Φ(x) = e^{iax} for some a ∈ R. (This shows in particular that every such 
homomorphism is smooth.) 

20. Suppose G ⊂ GL(n_1;C) and H ⊂ GL(n_2;C) are matrix Lie groups and 
that Φ : G → H is a Lie group homomorphism. Then, the image of G 
under Φ is a subgroup of H and thus of GL(n_2;C). Is the image of G under 
Φ necessarily a matrix Lie group? Prove or give a counter-example. 

21. Suppose P is a real, positive, symmetric matrix with determinant one. 
Show that there is a unique real, positive, symmetric matrix Q whose 
square is P. 

Hint: The existence of Q was discussed in Section 1.7. To prove uniqueness, 
consider two real, positive, symmetric square roots Q_1 and Q_2 of P and 
show that the eigenspaces of both Q_1 and Q_2 coincide with the eigenspaces 
of P. 


Lie Algebras and the Exponential Mapping 


2.1 The Matrix Exponential 


The exponential of a matrix plays a crucial role in the theory of Lie groups. 
The exponential enters into the definition of the Lie algebra of a matrix Lie 
group (Section 2.5) and is the mechanism for passing information from the Lie 
algebra to the Lie group. Since many computations are done much more easily 
at the level of the Lie algebra, the exponential is indispensable in studying 
(matrix) Lie groups. 

Let X be an n x n real or complex matrix. We wish to define the exponential 
of X, denoted $e^X$ or exp X, by the usual power series 


$$e^X = \sum_{m=0}^{\infty} \frac{X^m}{m!}. \tag{2.1}$$


We will follow the convention of using letters such as X and Y for the variable 
in the matrix exponential. 


Proposition 2.1. For any n x n real or complex matrix X, the series (2.1) 
converges. The matrix exponential $e^X$ is a continuous function of X. 


Before proving this, let us review some elementary analysis. Recall that 
the norm of a vector $x = (x_1, \ldots, x_n)$ in $\mathbb{C}^n$ is defined to be 


$$\|x\| = \sqrt{\langle x, x \rangle} = \left( \sum_{k=1}^{n} |x_k|^2 \right)^{1/2}.$$


We now define the norm of a matrix by thinking of the space $M_n(\mathbb{C})$ of all 
n x n matrices as $\mathbb{C}^{n^2}$. This means that we define 


$$\|X\| = \left( \sum_{k,l=1}^{n} |X_{kl}|^2 \right)^{1/2}. \tag{2.2}$$


This norm satisfies the inequalities 


$$\|X + Y\| \le \|X\| + \|Y\|, \tag{2.3}$$
$$\|XY\| \le \|X\| \, \|Y\| \tag{2.4}$$


for all $X, Y \in M_n(\mathbb{C})$. The first of these inequalities is the triangle 
inequality and is a standard result from elementary analysis. The second of these 
inequalities follows from the Schwarz inequality (Exercise 1). If $X_m$ is a 
sequence of matrices, then it is easy to see that $X_m$ converges to a matrix X in 
the sense of Definition 1.3 if and only if $\|X_m - X\| \to 0$ as $m \to \infty$. 

The norm (2.2) is called the Hilbert-Schmidt norm. There is another 
commonly used norm on the space of matrices, called the operator norm, 
whose definition is not relevant to us. It is easily shown that convergence in 
the Hilbert-Schmidt norm is equivalent to convergence in the operator norm. 
(This is true because we work with linear operators on the finite-dimensional 
space $\mathbb{C}^n$.) Furthermore, the operator norm also satisfies (2.3) and (2.4). Thus, 
it matters little whether we use the operator norm or the Hilbert-Schmidt 
norm. 

A sequence $X_m$ of matrices is said to be a Cauchy sequence if 


$$\|X_m - X_l\| \to 0$$


as $m, l \to \infty$. Thinking of the space $M_n(\mathbb{C})$ of matrices as $\mathbb{C}^{n^2}$ and using a 
standard result from analysis, we have the following. 


Proposition 2.2. If $X_m$ is a Cauchy sequence in $M_n(\mathbb{C})$, then there exists a 
unique matrix X such that $X_m$ converges to X. 


That is, every Cauchy sequence in $M_n(\mathbb{C})$ converges. 

Now, consider an infinite series whose terms are matrices: 


$$X_0 + X_1 + X_2 + \cdots. \tag{2.5}$$


If 


$$\sum_{m=0}^{\infty} \|X_m\| < \infty,$$

then the series (2.5) is said to converge absolutely. If a series converges 
absolutely, then it is not hard to show that the partial sums of the series form 
a Cauchy sequence, and, hence, by Proposition 2.2, the series converges. That 
is, any series which converges absolutely also converges. (The converse is not 
true; a series of matrices can converge without converging absolutely.) 

We now turn to the proof of Proposition 2.1. 


Proof. In light of (2.4), we see that 


$$\|X^m\| \le \|X\|^m,$$


and, hence, 


$$\sum_{m=0}^{\infty} \left\| \frac{X^m}{m!} \right\| \le \sum_{m=0}^{\infty} \frac{\|X\|^m}{m!} = e^{\|X\|} < \infty.$$


Thus, the series (2.1) converges absolutely, and so it converges. 

To show continuity, note that since $X^m$ is a continuous function of X, 
the partial sums of (2.1) are continuous. However, it is easy to see that (2.1) 
converges uniformly on each set of the form $\{\|X\| \le R\}$, and so the sum is, 
again, continuous. □ 
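
As a numerical illustration of the convergence just proved, the following sketch sums the series (2.1) with a fixed number of terms. The names `mat_mul` and `mat_exp` and the default cutoff of 40 terms are choices made here for illustration and are not part of the text; a truncated series of this kind is reasonable only when the norm of X is of modest size.

```python
import math

def mat_mul(A, B):
    """Product of two square matrices stored as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_exp(X, terms=40):
    """Partial sum of the series (2.1): the sum of X^m / m! for m < terms."""
    n = len(X)
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]            # current term X^m / m!
    for m in range(1, terms):
        product = mat_mul(term, X)
        term = [[entry / m for entry in row] for row in product]
        result = [[result[i][j] + term[i][j] for j in range(n)]
                  for i in range(n)]
    return result

# Compare a 2 x 2 example with the rotation matrix built from cos and sin.
a = 0.5
E = mat_exp([[0.0, -a], [a, 0.0]])
print(E)
print([[math.cos(a), -math.sin(a)], [math.sin(a), math.cos(a)]])
```

Adding more terms changes the result only at the level of rounding error, which reflects the absolute convergence of the series.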


We now list some elementary properties of the matrix exponential. 


Proposition 2.3. Let X and Y be arbitrary n x n matrices. Then, we have 
the following: 


1. $e^0 = I$. 
2. $(e^X)^* = e^{X^*}$. 
3. $e^X$ is invertible and $(e^X)^{-1} = e^{-X}$. 
4. $e^{(\alpha+\beta)X} = e^{\alpha X} e^{\beta X}$ for all $\alpha$ and $\beta$ in $\mathbb{C}$. 
5. If XY = YX, then $e^{X+Y} = e^X e^Y = e^Y e^X$. 
6. If C is invertible, then $e^{CXC^{-1}} = C e^X C^{-1}$. 
7. $\|e^X\| \le e^{\|X\|}$. 


It is not true in general that $e^{X+Y} = e^X e^Y$, although, by Point 5, it is true 
if X and Y commute. This is a crucial point, which we will consider in detail 
later. (See the Lie product formula in Section 2.4 and the Baker-Campbell-
Hausdorff formula in Chapter 3.) 
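
To see concretely that commutativity is needed, consider the following worked example. The matrices are chosen here for illustration and do not appear in the text. Take 

$$X = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}, \qquad Y = \begin{pmatrix} 0 & 0 \\ 1 & 0 \end{pmatrix}, \qquad XY - YX = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} \neq 0.$$

Since $X^2 = Y^2 = 0$, the exponential series terminate, so that $e^X = I + X$ and $e^Y = I + Y$, and 

$$e^X e^Y = \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix} \begin{pmatrix} 1 & 0 \\ 1 & 1 \end{pmatrix} = \begin{pmatrix} 2 & 1 \\ 1 & 1 \end{pmatrix}.$$

On the other hand, $(X + Y)^2 = I$, so the even and odd powers in the series can be summed separately: 

$$e^{X+Y} = (\cosh 1)\, I + (\sinh 1)(X + Y) = \begin{pmatrix} \cosh 1 & \sinh 1 \\ \sinh 1 & \cosh 1 \end{pmatrix},$$

which is not equal to $e^X e^Y$. 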


Proof. Point 1 is obvious and Point 2 follows from taking term-by-term 
adjoints of the series for $e^X$. Points 3 and 4 are special cases of Point 5. To verify 
Point 5, we simply multiply the power series term by term. (It is left to the 
reader to verify that this is legal.) Thus, 


$$e^X e^Y = \left( I + X + \frac{X^2}{2!} + \cdots \right) \left( I + Y + \frac{Y^2}{2!} + \cdots \right).$$


Multiplying this out and collecting terms where the power of X plus the power 
of Y equals m, we get 


$$e^X e^Y = \sum_{m=0}^{\infty} \sum_{k=0}^{m} \frac{X^k}{k!} \frac{Y^{m-k}}{(m-k)!} = \sum_{m=0}^{\infty} \frac{1}{m!} \sum_{k=0}^{m} \frac{m!}{k!(m-k)!} X^k Y^{m-k}. \tag{2.6}$$


Now, because (and only because) X and Y commute, 


$$(X + Y)^m = \sum_{k=0}^{m} \frac{m!}{k!(m-k)!} X^k Y^{m-k},$$


and, thus, (2.6) becomes 


$$e^X e^Y = \sum_{m=0}^{\infty} \frac{1}{m!} (X + Y)^m = e^{X+Y}.$$


To prove Point 6, simply note that 


$$(CXC^{-1})^m = C X^m C^{-1}$$


and, thus, the two sides of Point 6 are equal term by term. 
Point 7 is evident from the proof of Proposition 2.1. □ 


Proposition 2.4. Let X be an n x n complex matrix. Then, $e^{tX}$ is a smooth 
curve in $M_n(\mathbb{C})$ and 


$$\frac{d}{dt} e^{tX} = X e^{tX} = e^{tX} X.$$


In particular, 


$$\left. \frac{d}{dt} e^{tX} \right|_{t=0} = X.$$


Proof. Differentiate the power series for $e^{tX}$ term by term. (This is permitted 
because, for each i and j, $(e^{tX})_{ij}$ is given by a convergent power series in t, 
and it is a standard theorem that one can differentiate power series term by 
term.) □ 


2.2 Computing the Exponential of a Matrix 


We consider here methods for exponentiating general matrices. A special 
method for exponentiating 2 x 2 matrices is described in Exercises 6 and 
7. 


2.2.1 Case 1: X is diagonalizable 


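Suppose that X is an n x n real or complex matrix and that X is diagonalizable 
over $\mathbb{C}$; that is, there exists an invertible complex matrix C such that 
$X = CDC^{-1}$, with 


$$D = \begin{pmatrix} \lambda_1 & & 0 \\ & \ddots & \\ 0 & & \lambda_n \end{pmatrix}.$$


It is easily verified that $e^D$ is the diagonal matrix with eigenvalues $e^{\lambda_1}, \ldots, e^{\lambda_n}$, 
and so in light of Proposition 2.3, we have 


$$e^X = C e^D C^{-1} = C \begin{pmatrix} e^{\lambda_1} & & 0 \\ & \ddots & \\ 0 & & e^{\lambda_n} \end{pmatrix} C^{-1}.$$


Thus, if we can explicitly diagonalize X, we can explicitly compute $e^X$. Note 
that if X is real, then although C may be complex and the $\lambda_k$'s may be 
complex, $e^X$ must come out to be real, since each term in the series (2.1) is 
real. 

For example, take 


$$X = \begin{pmatrix} 0 & -a \\ a & 0 \end{pmatrix}.$$


Then, the eigenvectors of X are $\begin{pmatrix} 1 \\ i \end{pmatrix}$ and $\begin{pmatrix} i \\ 1 \end{pmatrix}$ with eigenvalues $-ia$ and $ia$, 
respectively. Thus, the invertible matrix 


$$C = \begin{pmatrix} 1 & i \\ i & 1 \end{pmatrix}$$


maps the basis vectors $\begin{pmatrix} 1 \\ 0 \end{pmatrix}$ and $\begin{pmatrix} 0 \\ 1 \end{pmatrix}$ to the eigenvectors of X, and so (check) 
$C^{-1}XC$ is a diagonal matrix D. Thus, $X = CDC^{-1}$ and 


$$e^X = \begin{pmatrix} 1 & i \\ i & 1 \end{pmatrix} \begin{pmatrix} e^{-ia} & 0 \\ 0 & e^{ia} \end{pmatrix} \begin{pmatrix} 1/2 & -i/2 \\ -i/2 & 1/2 \end{pmatrix} = \begin{pmatrix} \cos a & -\sin a \\ \sin a & \cos a \end{pmatrix}.$$


Note explicitly that if X (and hence a) is real, then $e^X$ is real. See Exercise 6 
for an alternative method of calculation. 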


2.2.2 Case 2: X is nilpotent 


An n x n matrix X is said to be nilpotent if $X^m = 0$ for some positive 
integer m. Of course, if $X^m = 0$, then $X^l = 0$ for all $l > m$. In this case, the 
series (2.1), which defines $e^X$, terminates after the first m terms, and so can 
be computed explicitly. 

For example, let us compute $e^X$, where 


$$X = \begin{pmatrix} 0 & a & b \\ 0 & 0 & c \\ 0 & 0 & 0 \end{pmatrix}.$$


Note that 


$$X^2 = \begin{pmatrix} 0 & 0 & ac \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}$$


and that $X^3 = 0$. Thus, 


$$e^X = I + X + \frac{X^2}{2} = \begin{pmatrix} 1 & a & b + ac/2 \\ 0 & 1 & c \\ 0 & 0 & 1 \end{pmatrix}.$$


2.2.3 Case 3: X arbitrary 


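A general matrix X may be neither nilpotent nor diagonalizable. However, 
by Theorem B.6, every matrix X can be written (uniquely) in the form 
X = S + N, with S diagonalizable, N nilpotent, and SN = NS. Then, since N 
and S commute, 


$$e^X = e^{S+N} = e^S e^N,$$


and $e^S$ and $e^N$ can be computed as in the two previous subsections. 
For example, take 


$$X = \begin{pmatrix} a & b \\ 0 & a \end{pmatrix}.$$


Then, 


$$X = \begin{pmatrix} a & 0 \\ 0 & a \end{pmatrix} + \begin{pmatrix} 0 & b \\ 0 & 0 \end{pmatrix}.$$


The two terms clearly commute (since the first one is a multiple of the 
identity), and, so, 


$$e^X = \begin{pmatrix} e^a & 0 \\ 0 & e^a \end{pmatrix} \begin{pmatrix} 1 & b \\ 0 & 1 \end{pmatrix} = \begin{pmatrix} e^a & e^a b \\ 0 & e^a \end{pmatrix}.$$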
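
The nilpotent case lends itself to exact computation, since the series (2.1) is then a finite sum. The sketch below is an illustration only: the name `exp_nilpotent` is hypothetical, the entries are assumed to be integers or fractions, and the code relies on the standard fact that a nilpotent n x n matrix satisfies $X^n = 0$. 

```python
from fractions import Fraction

def mat_mul(A, B):
    """Product of two square matrices stored as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def exp_nilpotent(X):
    """Exact exponential of a nilpotent matrix: the finite sum of X^m / m!."""
    n = len(X)
    result = [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]            # current term X^m / m!
    for m in range(1, n + 1):
        term = [[entry / m for entry in row] for row in mat_mul(term, X)]
        result = [[result[i][j] + term[i][j] for j in range(n)]
                  for i in range(n)]
    # After the loop, term equals X^n / n!, which vanishes if X is nilpotent.
    if any(entry != 0 for row in term for entry in row):
        raise ValueError("X is not nilpotent")
    return result

# The strictly upper triangular example with a = 2, b = 3, c = 5.
print(exp_nilpotent([[0, 2, 3], [0, 0, 5], [0, 0, 0]]))
```

With a = 2, b = 3, c = 5, the upper-right entry of the result is b + ac/2 = 8, in agreement with the formula for the strictly upper triangular example in Case 2. 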


2.3 The Matrix Logarithm 


We wish to define a matrix logarithm, which should be an inverse function (to 
the extent possible) to the matrix exponential. Let us recall the situation for 
the logarithm of complex numbers, in order to see what is reasonable to expect 
in the matrix case. Since $e^z$ is never zero, only nonzero numbers can have a 
logarithm. Every nonzero complex number can be written as $e^z$ for some z, 
but the z is not unique. There is no continuous way to define the logarithm on 
the set of all nonzero complex numbers. The situation for matrices is similar. 
For any $X \in M_n(\mathbb{C})$, $e^X$ is invertible; therefore, only invertible matrices can 
possibly have a logarithm. We will see (Theorem 2.9) that every invertible 
matrix can be written as $e^X$, for some $X \in M_n(\mathbb{C})$. However, the X is not 
unique and there is no continuous way to define the matrix logarithm on the 
set of all invertible matrices. 

The simplest way to define the matrix logarithm is by a power series. We 
recall how this works in the complex case. 


Lemma 2.5. The function 


$$\log z = \sum_{m=1}^{\infty} (-1)^{m+1} \frac{(z-1)^m}{m} \tag{2.8}$$


is defined and analytic in a circle of radius 1 about z = 1. 
For all z with |z - 1| < 1, 


$$e^{\log z} = z.$$


For all u with |u| < log 2, $|e^u - 1| < 1$ and 


$$\log e^u = u.$$

Proof. The usual logarithm for real, positive numbers satisfies 


$$\frac{d}{dx} \log(1 - x) = \frac{-1}{1 - x} = -\left( 1 + x + x^2 + \cdots \right)$$


for |x| < 1. Integrating term by term and noting that log 1 = 0 gives 


$$\log(1 - x) = -\left( x + \frac{x^2}{2} + \frac{x^3}{3} + \cdots \right).$$


Taking z = 1 - x (so that x = 1 - z), we have 


$$\log z = -\left( (1 - z) + \frac{(1 - z)^2}{2} + \frac{(1 - z)^3}{3} + \cdots \right) = \sum_{m=1}^{\infty} (-1)^{m+1} \frac{(z - 1)^m}{m}.$$


This series has radius of convergence 1 and defines a complex analytic 
function on the set {|z - 1| < 1}, which coincides with the usual logarithm 
for real z in the interval (0, 2). Now, exp(log z) = z for z in (0, 2), and by 
analyticity, this identity continues to hold on the whole set {|z - 1| < 1}. 
(That is to say, the functions $z \mapsto \exp(\log z)$ and $z \mapsto z$ are both complex 
analytic functions and they agree on the interval (0, 2); therefore they must 
agree on the whole disk {|z - 1| < 1}.) 

On the other hand, if |u| < log 2, then 


$$|e^u - 1| = \left| u + \frac{u^2}{2!} + \cdots \right| \le |u| + \frac{|u|^2}{2!} + \cdots = e^{|u|} - 1 < 1.$$


Thus, log(exp u) makes sense for all such u. Since log(exp u) = u for real u 
with |u| < log 2, it follows by analyticity that log(exp u) = u for all complex 
numbers with |u| < log 2. □ 


Definition 2.6. For any n x n matrix A, define log A by 


$$\log A = \sum_{m=1}^{\infty} (-1)^{m+1} \frac{(A - I)^m}{m} \tag{2.9}$$


whenever the series converges. 


Since the complex-valued series (2.8) has radius of convergence 1 and 
since $\|(A - I)^m\| \le \|A - I\|^m$, the matrix-valued series (2.9) will converge 
if $\|A - I\| < 1$. However, in contrast to the complex-valued case, the series 
(2.9) may converge even if $\|A - I\| > 1$, since $\|(A - I)^m\|$ may be strictly 
smaller than $\|A - I\|^m$. For example, if A - I is nilpotent, then (2.9) 
terminates and, thus, converges. (See Exercise 8.) Nevertheless, we will mostly 
content ourselves with considering the case $\|A - I\| < 1$. 


Theorem 2.7. The function 


$$\log A = \sum_{m=1}^{\infty} (-1)^{m+1} \frac{(A - I)^m}{m}$$


is defined and continuous on the set of all n x n complex matrices A with 
$\|A - I\| < 1$. 
For all A with $\|A - I\| < 1$, 


$$e^{\log A} = A.$$


For all X with $\|X\| < \log 2$, $\|e^X - I\| < 1$ and 


$$\log e^X = X.$$


Proof. Since $\|(A - I)^m\| \le \|A - I\|^m$ and since the series (2.8) has radius of 
convergence 1, the series (2.9) converges absolutely for all A with $\|A - I\| < 1$. 
The proof of continuity is essentially the same as for the exponential. 

We will now show that exp(log A) = A for all A with $\|A - I\| < 1$. We do 
this by considering two cases. 

Case 1: A is diagonalizable. 

Suppose that $A = CDC^{-1}$, with D diagonal. Then, $A - I = CDC^{-1} - I = 
C(D - I)C^{-1}$. It follows that $(A - I)^m$ is of the form 


$$(A - I)^m = C \begin{pmatrix} (z_1 - 1)^m & & 0 \\ & \ddots & \\ 0 & & (z_n - 1)^m \end{pmatrix} C^{-1},$$


where $z_1, \ldots, z_n$ are the eigenvalues of A. 
Now, if $\|A - I\| < 1$, then it is not hard to show (Exercise 2) that each 
eigenvalue $z_k$ of A must satisfy $|z_k - 1| < 1$. Thus, 


$$\sum_{m=1}^{\infty} (-1)^{m+1} \frac{(A - I)^m}{m} = C \begin{pmatrix} \log z_1 & & 0 \\ & \ddots & \\ 0 & & \log z_n \end{pmatrix} C^{-1},$$


and by Lemma 2.5, 


$$e^{\log A} = C \begin{pmatrix} e^{\log z_1} & & 0 \\ & \ddots & \\ 0 & & e^{\log z_n} \end{pmatrix} C^{-1} = A.$$




Case 2: A is not diagonalizable. 

If A is not diagonalizable, then, using Theorem B.7, it is not difficult 
to construct a sequence $A_m$ of diagonalizable matrices with $A_m \to A$. (See 
Exercise 5.) If $\|A - I\| < 1$, then $\|A_m - I\| < 1$ for all sufficiently large m. 
By Case 1, $\exp(\log A_m) = A_m$, and, so, by the continuity of exp and log, 
exp(log A) = A. 

Thus, we have shown that exp(log A) = A for all A with $\|A - I\| < 1$. 
Now, the same argument as in the complex case shows that if $\|X\| < \log 2$, 
then $\|e^X - I\| < 1$. The same two-case argument shows that log(exp X) = X 
for all such X. □ 
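
The two identities of Theorem 2.7 can be checked numerically with truncated series. The sketch below is illustrative only: the function names, the cutoffs (40 and 200 terms), and the sample matrices are assumptions made here, chosen so that $\|A - I\| < 1$ and $\|X\| < \log 2$ in the Hilbert-Schmidt norm. 

```python
def mat_mul(A, B):
    """Product of two square matrices stored as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

def mat_exp(X, terms=40):
    """Partial sum of the exponential series (2.1)."""
    n = len(X)
    result, term = identity(n), identity(n)
    for m in range(1, terms):
        term = [[entry / m for entry in row] for row in mat_mul(term, X)]
        result = [[result[i][j] + term[i][j] for j in range(n)]
                  for i in range(n)]
    return result

def mat_log(A, terms=200):
    """Partial sum of the logarithm series (2.9); intended for ||A - I|| < 1."""
    n = len(A)
    B = [[A[i][j] - (1.0 if i == j else 0.0) for j in range(n)]
         for i in range(n)]
    power = identity(n)                          # holds (A - I)^m
    result = [[0.0] * n for _ in range(n)]
    for m in range(1, terms + 1):
        power = mat_mul(power, B)
        sign = 1.0 if m % 2 == 1 else -1.0       # the factor (-1)^(m+1)
        result = [[result[i][j] + sign * power[i][j] / m for j in range(n)]
                  for i in range(n)]
    return result

def max_abs_diff(A, B):
    return max(abs(a - b) for ra, rb in zip(A, B) for a, b in zip(ra, rb))

A = [[1.1, 0.2], [-0.1, 0.9]]       # ||A - I|| is about 0.26
X = [[0.1, 0.3], [-0.2, 0.05]]      # ||X|| is about 0.38, below log 2
print(max_abs_diff(mat_exp(mat_log(A)), A))
print(max_abs_diff(mat_log(mat_exp(X)), X))
```

Both printed differences should be at the level of floating-point rounding error. 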


Proposition 2.8. There exists a constant c such that for all n x n matrices 
B with $\|B\| < \frac{1}{2}$, 


$$\|\log(I + B) - B\| \le c \|B\|^2.$$


Proof. Note that 


$$\log(I + B) - B = \sum_{m=2}^{\infty} (-1)^{m+1} \frac{B^m}{m} = B^2 \sum_{m=2}^{\infty} (-1)^{m+1} \frac{B^{m-2}}{m},$$


so that 


$$\|\log(I + B) - B\| \le \|B\|^2 \sum_{m=2}^{\infty} \frac{\left( \frac{1}{2} \right)^{m-2}}{m}.$$


This is what we want. (It is easily verified that the sum in the last expression 
is convergent.) □ 

We may restate the proposition in a more concise way by saying that 


$$\log(I + B) = B + O(\|B\|^2),$$


where $O(\|B\|^2)$ denotes a quantity of order $\|B\|^2$ (i.e., a quantity that is 
bounded by a constant times $\|B\|^2$ for all sufficiently small values of $\|B\|$). 
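
For instance, the sum appearing in the proof of Proposition 2.8 can be evaluated in closed form, which gives one admissible value of the constant c. This computation is added here as an illustration; it uses the standard series $\sum_{m \ge 1} x^m/m = -\log(1 - x)$ for $|x| < 1$, evaluated at $x = 1/2$: 

$$\sum_{m=2}^{\infty} \frac{(1/2)^{m-2}}{m} = 4 \sum_{m=2}^{\infty} \frac{(1/2)^m}{m} = 4 \left( \log 2 - \frac{1}{2} \right) = 4 \log 2 - 2 \approx 0.77.$$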
We conclude this section with a result that, although we will not use it 
elsewhere, is worth recording. The proof is sketched in Exercises 8 and 9. 


Theorem 2.9. Every invertible n x n matrix can be expressed as $e^X$ for some 
$X \in M_n(\mathbb{C})$. 


2.4 Further Properties of the Matrix Exponential 


In this section, we give several additional results involving the exponential of 
a matrix that will be important in our study of Lie algebras. 


Theorem 2.10 (Lie Product Formula). Let X and Y be n x n complex 
matrices. Then, 


$$e^{X+Y} = \lim_{m \to \infty} \left( e^{X/m} e^{Y/m} \right)^m.$$


This theorem has a big brother, called the Trotter product formula, which 
gives the same result in the case where X and Y are suitable unbounded 
operators on an infinite-dimensional Hilbert space. The Trotter product formula 
is described, for example, in Reed and Simon (1980), Section VIII.8. 


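Proof. If we multiply the power series for $e^{X/m}$ and $e^{Y/m}$, all but three of the 
terms will involve $1/m^2$ or higher powers of 1/m. Thus, 


$$e^{X/m} e^{Y/m} = I + \frac{X}{m} + \frac{Y}{m} + O\!\left( \frac{1}{m^2} \right).$$


Now, since $e^{X/m} e^{Y/m} \to I$ as $m \to \infty$, $e^{X/m} e^{Y/m}$ is in the domain of the logarithm 
for all sufficiently large m. By Proposition 2.8, 


$$\begin{aligned}
\log\left( e^{X/m} e^{Y/m} \right) &= \log\left( I + \frac{X}{m} + \frac{Y}{m} + O\!\left( \frac{1}{m^2} \right) \right) \\
&= \frac{X}{m} + \frac{Y}{m} + O\!\left( \left\| \frac{X}{m} + \frac{Y}{m} + O\!\left( \frac{1}{m^2} \right) \right\|^2 \right) \\
&= \frac{X}{m} + \frac{Y}{m} + O\!\left( \frac{1}{m^2} \right).
\end{aligned}$$


Exponentiating the logarithm then gives 


$$e^{X/m} e^{Y/m} = \exp\left( \frac{X}{m} + \frac{Y}{m} + O\!\left( \frac{1}{m^2} \right) \right)$$


and, therefore, 


$$\left( e^{X/m} e^{Y/m} \right)^m = \exp\left( X + Y + O\!\left( \frac{1}{m} \right) \right).$$


Thus, by the continuity of the exponential, we conclude that 


$$\lim_{m \to \infty} \left( e^{X/m} e^{Y/m} \right)^m = \exp(X + Y),$$


which is the Lie product formula. □ 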
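
The rate of convergence suggested by the proof (an error of order 1/m) can be observed numerically. The following sketch is an illustration under stated assumptions: the helper names and the pair of non-commuting test matrices are chosen here, and the exponentials are computed from a truncated power series rather than by any method given in the text. 

```python
import math

def mat_mul(A, B):
    """Product of two square matrices stored as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

def mat_exp(X, terms=40):
    """Partial sum of the exponential series (2.1)."""
    n = len(X)
    result, term = identity(n), identity(n)
    for m in range(1, terms):
        term = [[entry / m for entry in row] for row in mat_mul(term, X)]
        result = [[result[i][j] + term[i][j] for j in range(n)]
                  for i in range(n)]
    return result

def lie_product_error(X, Y, m):
    """Hilbert-Schmidt distance between (e^{X/m} e^{Y/m})^m and e^{X+Y}."""
    n = len(X)
    step = mat_mul(mat_exp([[x / m for x in row] for row in X]),
                   mat_exp([[y / m for y in row] for row in Y]))
    approx = identity(n)
    for _ in range(m):                           # m-th power by repeated product
        approx = mat_mul(approx, step)
    exact = mat_exp([[X[i][j] + Y[i][j] for j in range(n)] for i in range(n)])
    return math.sqrt(sum((approx[i][j] - exact[i][j]) ** 2
                         for i in range(n) for j in range(n)))

X = [[0.0, 1.0], [0.0, 0.0]]
Y = [[0.0, 0.0], [1.0, 0.0]]        # XY != YX
errors = {m: lie_product_error(X, Y, m) for m in (1, 10, 100, 1000)}
for m, err in errors.items():
    print(m, err)
```

For this pair, each tenfold increase in m should reduce the error by roughly a factor of ten, consistent with the O(1/m) term in the proof. 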


Recall (Section B.5) that the trace of a matrix is defined as the sum of its 
diagonal entries and that similar matrices have the same trace. 


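Theorem 2.11. For any $X \in M_n(\mathbb{C})$, we have 


$$\det\left( e^X \right) = e^{\operatorname{trace}(X)}.$$


Proof. There are three cases, as in Section 2.2. 
Case 1: X is diagonalizable. Suppose there is a complex invertible matrix 
C such that 


$$X = C \begin{pmatrix} \lambda_1 & & 0 \\ & \ddots & \\ 0 & & \lambda_n \end{pmatrix} C^{-1}.$$


Then, 


$$e^X = C \begin{pmatrix} e^{\lambda_1} & & 0 \\ & \ddots & \\ 0 & & e^{\lambda_n} \end{pmatrix} C^{-1}.$$


Thus, $\operatorname{trace}(X) = \sum \lambda_i$ and $\det(e^X) = \prod e^{\lambda_i} = e^{\sum \lambda_i}$. 
Case 2: X is nilpotent. If X is nilpotent, then by Theorem B.7, there is 
an invertible matrix C such that 


$$X = C \begin{pmatrix} 0 & & * \\ & \ddots & \\ 0 & & 0 \end{pmatrix} C^{-1}.$$


In that case (it is easy to see), $e^X$ will be upper triangular, with ones on the 
diagonal: 


$$e^X = C \begin{pmatrix} 1 & & * \\ & \ddots & \\ 0 & & 1 \end{pmatrix} C^{-1}.$$


Thus, if X is nilpotent, trace(X) = 0 and $\det(e^X) = 1$. 

Case 3: X is arbitrary. As pointed out in Section 2.2, every matrix X 
can be written as the sum of two commuting matrices S and N, with S 
diagonalizable (over $\mathbb{C}$) and N nilpotent. Since S and N commute, $e^X = e^S e^N$. 
So, by the two previous cases, 


$$\det\left( e^X \right) = \det\left( e^S \right) \det\left( e^N \right) = e^{\operatorname{trace}(S)} e^{\operatorname{trace}(N)} = e^{\operatorname{trace}(X)},$$


which is what we want. (Note that trace(N) = 0 and trace(S) = trace(X).) □ 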
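
Theorem 2.11 is easy to test numerically in the 2 x 2 case, where the determinant has a simple closed form. The sketch below is illustrative: the helper names, the truncated series used for the exponential, and the sample complex matrix are all assumptions made here. 

```python
import cmath

def mat_mul(A, B):
    """Product of two square matrices stored as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_exp(X, terms=40):
    """Partial sum of the exponential series (2.1); entries may be complex."""
    n = len(X)
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for m in range(1, terms):
        term = [[entry / m for entry in row] for row in mat_mul(term, X)]
        result = [[result[i][j] + term[i][j] for j in range(n)]
                  for i in range(n)]
    return result

def det2(A):
    """Determinant of a 2 x 2 matrix."""
    return A[0][0] * A[1][1] - A[0][1] * A[1][0]

X = [[1 + 2j, 0.5], [-0.3, 0.2 - 1j]]
lhs = det2(mat_exp(X))                  # det(e^X)
rhs = cmath.exp(X[0][0] + X[1][1])      # e^{trace(X)}
print(lhs, rhs)
```

The two printed complex numbers should agree up to rounding error. 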


Definition 2.12. A function $A : \mathbb{R} \to GL(n;\mathbb{C})$ is called a one-parameter 
subgroup of GL(n;C) if 


1. A is continuous, 
2. A(0) = I, 
3. A(t + s) = A(t)A(s) for all $t, s \in \mathbb{R}$. 


Theorem 2.13 (One-Parameter Subgroups). If A is a one-parameter 
subgroup of GL(n;C), then there exists a unique n x n complex matrix X 
such that 


$$A(t) = e^{tX}.$$


By taking n = 1, and noting that $GL(1;\mathbb{C}) = \mathbb{C}^*$, this theorem provides a 
method of solving Exercise 19 in Chapter 1. 


Proof. The uniqueness is immediate, since if there is such an X, then 
$X = \left. \frac{d}{dt} A(t) \right|_{t=0}$. So, we need only worry about existence. 

Let $B_\varepsilon$ be the open ball of radius $\varepsilon$ about zero in $M_n(\mathbb{C})$; that is, 
$B_\varepsilon = \{ X \in M_n(\mathbb{C}) \mid \|X\| < \varepsilon \}$. Assume that $\varepsilon < \log 2$. Then, we have shown that 
"exp" takes $B_\varepsilon$ injectively into $M_n(\mathbb{C})$, with continuous inverse that we denote 
"log." Now, let $U = \exp(B_{\varepsilon/2})$, which is an open set in GL(n; C). 


Lemma 2.14. Every $g \in U$ has a unique square root h in U, given by 
$h = \exp(\frac{1}{2} \log g)$. 


Proof. Let X = log g. Then, h = exp(X/2) is a square root of g, since 
$h^2 = \exp(X) = g$. Suppose $h' \in U$ satisfies $(h')^2 = g$. Let $Y = \log h'$; 
then, $\exp(Y) = h'$ and $\exp(2Y) = (h')^2 = g = \exp(X)$. We have that 
$Y \in B_{\varepsilon/2}$ and, thus, $2Y \in B_\varepsilon$, and also that $X \in B_{\varepsilon/2} \subset B_\varepsilon$. Since exp 
is injective on $B_\varepsilon$ and exp(2Y) = exp(X), we must have 2Y = X. Thus, 
$h' = \exp(Y) = \exp(X/2) = h$. This shows the uniqueness of the square root 
in U. □ 


Returning to the proof of Theorem 2.13, the continuity of A guarantees 
that there exists $t_0 > 0$ such that $A(t) \in U$ for all t with $|t| \le t_0$. Then, 
let $X = \frac{1}{t_0} \log(A(t_0))$, so that $t_0 X = \log(A(t_0))$. Then, $t_0 X \in B_{\varepsilon/2}$ and 
$A(t_0) = \exp(t_0 X)$. Then, $A(t_0/2)$ is in U and $A(t_0/2)^2 = A(t_0)$. By the 
lemma, $A(t_0)$ has a unique square root in U, and that unique square root is 
$\exp(t_0 X/2)$. So, we must have $A(t_0/2) = \exp(t_0 X/2)$. 

Applying this argument repeatedly, we conclude that 


$$A(t_0/2^k) = \exp(t_0 X/2^k)$$


for all positive integers k. Then, for any integer m, we have 
$A(m t_0/2^k) = A(t_0/2^k)^m = \exp(m t_0 X/2^k)$. This means that A(t) = exp(tX) for all real 
numbers t of the form $t = m t_0/2^k$, and the set of such t's is dense in $\mathbb{R}$. Since 
both exp(tX) and A(t) are continuous, it follows that A(t) = exp(tX) for all 
real numbers t. □ 


2.5 The Lie Algebra of a Matrix Lie Group 


The Lie algebra is an indispensable tool in studying matrix Lie groups. On the 
one hand, Lie algebras are simpler than matrix Lie groups, because (as we will 
see) the Lie algebra is a linear space. Thus, we can understand much about 
Lie algebras just by doing linear algebra. On the other hand, the Lie algebra 
of a matrix Lie group contains much information about that group. (See, for 
example, Theorem 2.27 in Section 2.7, and the Baker-Campbell-Hausdorff 
Formula (Chapter 3).) Thus, many questions about matrix Lie groups can be 
answered by considering a similar but easier problem for the Lie algebra. 


Definition 2.15. Let G be a matrix Lie group. The Lie algebra of G, 
denoted $\mathfrak{g}$, is the set of all matrices X such that $e^{tX}$ is in G for all real 
numbers t. 


This means that X is in $\mathfrak{g}$ if and only if the one-parameter subgroup 
generated by X lies in G. Note that even though G is a subgroup of GL(n; C) 
(and not necessarily of GL(n;R)), we do not require that $e^{tX}$ be in G for all 
complex numbers t, but only for all real numbers t. Also, it is definitely not 
enough to have just $e^X$ in G. That is, it is easy to give an example of an X 
and a G such that $e^X \in G$ but such that $e^{tX} \notin G$ for some real values of t 
(Exercise 10). Such an X is not in the Lie algebra of G. 

There is an abstract notion of a Lie algebra (not necessarily associated to 
any group), which is described in Section 2.8. The results of Section 2.6 will 
show that $\mathfrak{g}$ is, indeed, a Lie algebra in that sense. 

It is customary to use lowercase Gothic (Fraktur) characters such as $\mathfrak{g}$ and 
$\mathfrak{h}$ to refer to Lie algebras. 

We will show in Section 2.7 that every matrix Lie group is an embedded 
submanifold of GL(n;C). We will then show that $\mathfrak{g}$ is the tangent space to 
G at the identity. See Corollary 2.35. This means that $\mathfrak{g}$ can alternatively be 
defined as the set of all derivatives of smooth curves through the identity in 
G. 


2.5.1 Physicists’ Convention 


Physicists are accustomed to considering the map $X \mapsto e^{iX}$ instead of 
$X \mapsto e^X$. Thus, a physicist would think of the Lie algebra of G as the set of all 
matrices X such that $e^{itX} \in G$ for all real numbers t. In the physics literature, 
the Lie algebra is frequently referred to as the space of “infinitesimal group 
elements.” The physics literature does not always distinguish clearly between 
a matrix Lie group and its Lie algebra. 

Before examining general properties of the Lie algebra, let us compute the 
Lie algebras of the matrix Lie groups introduced in the previous chapter. 


2.5.2 The general linear groups 


If X is any n x n complex matrix, then by Proposition 2.3, $e^{tX}$ is invertible. 
Thus, the Lie algebra of GL(n; C) is the space of all n x n complex matrices. 
This Lie algebra is denoted gl(n; C). 

If X is any n x n real matrix, then $e^{tX}$ will be invertible and real. On 
the other hand, if $e^{tX}$ is real for all real numbers t, then $X = \left. \frac{d}{dt} e^{tX} \right|_{t=0}$ will 
also be real. Thus, the Lie algebra of GL(n;R) is the space of all n x n real 
matrices, denoted gl(n; R). 

Note that the preceding argument shows that if G is a subgroup of 
GL(n;R), then the Lie algebra of G must consist entirely of real matrices. 
We will use this fact when appropriate in what follows. 


2.5.3 The special linear groups 


Recall Theorem 2.11: $\det(e^X) = e^{\operatorname{trace}(X)}$. Thus, if trace(X) = 0, then 
$\det(e^{tX}) = 1$ for all real numbers t. On the other hand, if X is any n x n 
matrix such that $\det(e^{tX}) = 1$ for all t, then $e^{t \operatorname{trace}(X)} = 1$ for all t. This 
means that t trace(X) is an integer multiple of $2\pi i$ for all t, which is only 
possible if trace(X) = 0. Thus, the Lie algebra of SL(n; C) is the space of all 
n x n complex matrices with trace zero, denoted sl(n; C). 

Similarly, the Lie algebra of SL(n; R) is the space of all n x n real matrices 
with trace zero, denoted sl(n; R). 


2.5.4 The unitary groups 


Recall that a matrix U is unitary if and only if $U^* = U^{-1}$. Thus, $e^{tX}$ is unitary 
if and only if 


$$\left( e^{tX} \right)^* = \left( e^{tX} \right)^{-1} = e^{-tX}. \tag{2.10}$$


By Point 2 of Proposition 2.3, $(e^{tX})^* = e^{tX^*}$, and so (2.10) becomes 


$$e^{tX^*} = e^{-tX}. \tag{2.11}$$


Clearly, a sufficient condition for (2.11) to hold is that $X^* = -X$. On the 
other hand, if (2.11) holds for all t, then by differentiating at t = 0, we see 
that $X^* = -X$ is necessary. 

Thus, the Lie algebra of U(n) is the space of all n x n complex matrices 
X such that $X^* = -X$, denoted u(n). 

By combining the two previous computations, we see that the Lie algebra 
of SU(n) is the space of all n x n complex matrices X such that $X^* = -X$ 
and trace(X) = 0, denoted su(n). 


2.5.5 The orthogonal groups 


The identity component of O(n) is just SO(n). Since (Proposition 2.16) the 
exponential of a matrix in the Lie algebra is automatically in the identity 
component, the Lie algebra of O(n) is the same as the Lie algebra of SO(n). 
Now, an n x n real matrix R is orthogonal if and only if $R^{tr} = R^{-1}$. So, 
given an n x n real matrix X, $e^{tX}$ is orthogonal if and only if $(e^{tX})^{tr} = (e^{tX})^{-1}$, 
or 


$$e^{tX^{tr}} = e^{-tX}. \tag{2.12}$$


Clearly, a sufficient condition for this to hold is that $X^{tr} = -X$. If (2.12) 
holds for all t, then by differentiating at t = 0, we must have $X^{tr} = -X$. 

Thus, the Lie algebra of O(n), as well as the Lie algebra of SO(n), is the 
space of all n x n real matrices X with $X^{tr} = -X$, denoted so(n). Note that 
the condition $X^{tr} = -X$ forces the diagonal entries of X to be zero, and, so, 
necessarily the trace of X is zero. 

The same argument shows that the Lie algebra of SO(n; C) is the space of 
n x n complex matrices satisfying $X^{tr} = -X$, denoted so(n;C). This is not 
the same as su(n). 
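
The sufficiency half of this argument can be observed numerically: exponentiating tX for a real skew-symmetric X should give an orthogonal matrix for every real t. The sketch below is an illustration only; the helper names, the sample matrix, and the truncated series used for the exponential are assumptions made here. 

```python
def mat_mul(A, B):
    """Product of two square matrices stored as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_exp(X, terms=60):
    """Partial sum of the exponential series (2.1)."""
    n = len(X)
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for m in range(1, terms):
        term = [[entry / m for entry in row] for row in mat_mul(term, X)]
        result = [[result[i][j] + term[i][j] for j in range(n)]
                  for i in range(n)]
    return result

def orthogonality_defect(R):
    """Largest entry of |R^tr R - I|; this is zero exactly when R is orthogonal."""
    n = len(R)
    Rt = [list(col) for col in zip(*R)]          # transpose of R
    P = mat_mul(Rt, R)
    return max(abs(P[i][j] - (1.0 if i == j else 0.0))
               for i in range(n) for j in range(n))

# A real matrix with X^tr = -X, i.e. an element of so(3).
X = [[0.0, -1.0, 0.5],
     [1.0, 0.0, -2.0],
     [-0.5, 2.0, 0.0]]
for t in (0.5, 1.0, 2.0):
    R = mat_exp([[t * x for x in row] for row in X])
    print(t, orthogonality_defect(R))
```

Replacing X by a matrix that is not skew-symmetric (for example, a nonzero symmetric matrix) produces a defect that is visibly nonzero. 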


2.5.6 The generalized orthogonal groups 


A matrix A is in O(n; k) if and only if $A^{tr} g A = g$, where g is the (n+k) x (n+k) 
diagonal matrix with the first n diagonal entries equal to one and the last 
k diagonal entries equal to minus one. This condition is equivalent to the 
condition $g^{-1} A^{tr} g = A^{-1}$, or, since explicitly $g^{-1} = g$, $g A^{tr} g = A^{-1}$. Now, if 
X is an (n + k) x (n + k) real matrix, then $e^{tX}$ is in O(n; k) if and only if 


$$g e^{tX^{tr}} g = e^{t g X^{tr} g} = e^{-tX}.$$


This condition holds for all real t if and only if $g X^{tr} g = -X$. Thus, the Lie 
algebra of O(n; k), which is the same as the Lie algebra of SO(n; k), consists 
of all (n + k) x (n + k) real matrices X with $g X^{tr} g = -X$. This Lie algebra 
is denoted so(n; k). 

(In general, the group SO(n; k) will not be connected, in contrast to the 
group SO(n). The identity component of SO(n; k), which is also the identity 
component of O(n; k), is denoted $SO(n; k)_e$. The Lie algebra of $SO(n; k)_e$ is 
the same as the Lie algebra of SO(n; k).) 


2.5.7 The symplectic groups 


The Lie algebras of the symplectic groups are denoted sp(n; R), sp(n;C), and 
sp(n). The calculation of these Lie algebras is similar to that of the generalized 
orthogonal groups, and I will just record the result here. Let J be the matrix 
in the definition of the symplectic groups. Then, sp(n;R) is the space of 
2n x 2n real matrices X such that $J X^{tr} J = X$, sp(n;C) is the space of 2n x 2n 
complex matrices satisfying the same condition, and $sp(n) = sp(n;\mathbb{C}) \cap u(2n)$. 
A simple calculation shows that the elements of sp(n; C) are precisely the 
2n x 2n matrices of the form 


$$\begin{pmatrix} A & B \\ C & -A^{tr} \end{pmatrix},$$


where A is an arbitrary n x n matrix and B and C are arbitrary symmetric 
matrices. 


2.5.8 The Heisenberg group 


Recall that the Heisenberg group H is the group of all 3 x 3 real matrices A 
of the form 


$$A = \begin{pmatrix} 1 & a & b \\ 0 & 1 & c \\ 0 & 0 & 1 \end{pmatrix}, \tag{2.13}$$


with $a, b, c \in \mathbb{R}$. Recall also that in Section 2.2, Case 2, we computed the 
exponential of a matrix of the form 


$$X = \begin{pmatrix} 0 & \alpha & \beta \\ 0 & 0 & \gamma \\ 0 & 0 & 0 \end{pmatrix} \tag{2.14}$$


and saw that $e^X$ was in H. On the other hand, if X is any matrix such that 
$e^{tX}$ is of the form (2.13), then all of the entries of $X = \left. \frac{d}{dt} e^{tX} \right|_{t=0}$ which are 
on or below the diagonal must be zero, so that X is of form (2.14). 

Thus, the Lie algebra of the Heisenberg group is the space of all 3 x 3 real 
matrices that are strictly upper triangular. 
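
As a check on both directions of this argument, one can write out $e^{tX}$ for a strictly upper triangular X. The display below is added here as an illustration; it applies the computation of Section 2.2, Case 2, to tX, with the entries of X denoted $\alpha$, $\beta$, $\gamma$: 

$$X = \begin{pmatrix} 0 & \alpha & \beta \\ 0 & 0 & \gamma \\ 0 & 0 & 0 \end{pmatrix}, \qquad e^{tX} = I + tX + \frac{t^2 X^2}{2} = \begin{pmatrix} 1 & t\alpha & t\beta + t^2 \alpha\gamma/2 \\ 0 & 1 & t\gamma \\ 0 & 0 & 1 \end{pmatrix}.$$

This matrix lies in H for every real t, and differentiating entry by entry at t = 0 recovers X. 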


2.5.9 The Euclidean and Poincaré groups 


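Recall that the Euclidean group E(n) is (or can be thought of as) the group 
of (n+1) x (n+1) real matrices of the form 


$$\begin{pmatrix} & & & x_1 \\ & R & & \vdots \\ & & & x_n \\ 0 & \cdots & 0 & 1 \end{pmatrix}$$


with $R \in O(n)$. Now, if X is an (n+1) x (n+1) real matrix such that $e^{tX}$ is 
in E(n) for all t, then $X = \left. \frac{d}{dt} e^{tX} \right|_{t=0}$ must be zero along the bottom row: 


$$X = \begin{pmatrix} & & & y_1 \\ & Y & & \vdots \\ & & & y_n \\ 0 & \cdots & 0 & 0 \end{pmatrix}. \tag{2.15}$$


Our goal, then, is to determine which matrices of the form (2.15) are 
actually in the Lie algebra of the Euclidean group. A simple computation 
shows that for $m \ge 1$, 


$$\begin{pmatrix} & & & y_1 \\ & Y & & \vdots \\ & & & y_n \\ 0 & \cdots & 0 & 0 \end{pmatrix}^m = \begin{pmatrix} & & & \\ & Y^m & & Y^{m-1} y \\ & & & \\ 0 & \cdots & 0 & 0 \end{pmatrix},$$


where y is the column vector with entries $y_1, \ldots, y_n$. It follows that if X is as 
in (2.15), then $e^{tX}$ is of the form 


$$e^{tX} = \begin{pmatrix} & & & * \\ & e^{tY} & & \vdots \\ & & & * \\ 0 & \cdots & 0 & 1 \end{pmatrix}.$$


Now, we have already established that $e^{tY}$ is in O(n) for all t if and only 
if $Y^{tr} = -Y$. Thus, we see that the Lie algebra of E(n) is the space of all 
(n+1) x (n+1) real matrices of the form (2.15) with Y satisfying $Y^{tr} = -Y$. 

A similar argument shows that the Lie algebra of P(n;1) is the space of 
all (n + 2) x (n + 2) real matrices of the form 


$$\begin{pmatrix} & & & y_1 \\ & Y & & \vdots \\ & & & y_{n+1} \\ 0 & \cdots & 0 & 0 \end{pmatrix}$$


with $Y \in so(n; 1)$. 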


2.6 Properties of the Lie Algebra 


We will now establish various basic properties of the Lie algebra of a matrix 
Lie group. The reader is invited to verify by direct calculation that these 
general properties hold for the examples computed in the previous section. 


Proposition 2.16. Let G be a matrix Lie group, and X an element of its Lie 
algebra. Then, $e^X$ is an element of the identity component of G. 


Proof. By definition of the Lie algebra, $e^{tX}$ lies in G for all real t. However, 
as t varies from 0 to 1, $e^{tX}$ is a continuous path connecting the identity to 
$e^X$. □ 


Proposition 2.17. Let G be a matrix Lie group, with Lie algebra $\mathfrak{g}$. Let X 
be an element of $\mathfrak{g}$, and A an element of G. Then, $AXA^{-1}$ is in $\mathfrak{g}$. 


Proof. This is immediate, since, by Proposition 2.3, 


$$e^{t(AXA^{-1})} = A e^{tX} A^{-1},$$


and $A e^{tX} A^{-1} \in G$ for all t, since G is a group. □ 


Theorem 2.18. Let G be a matrix Lie group, $\mathfrak{g}$ its Lie algebra, and X and 
Y elements of $\mathfrak{g}$. Then 


1. $sX \in \mathfrak{g}$ for all real numbers s, 
2. $X + Y \in \mathfrak{g}$, 
3. $XY - YX \in \mathfrak{g}$. 


If one follows the physics convention for the definition of the Lie algebra, 
then condition 3 should be replaced with the condition $-i(XY - YX) \in \mathfrak{g}$. 
Properties 1 and 2 show that $\mathfrak{g}$ is a real vector space (i.e., a real subspace of 
the space $M_n(\mathbb{C})$). Property 3 shows that $\mathfrak{g}$ is, in fact, a Lie algebra in the 
abstract sense described in Section 2.8. Note that Property 1 applies only to 
real numbers s (compare Definition 2.20). 


Proof. Point 1 is immediate, since $e^{t(sX)} = e^{(ts)X}$, which must be in G if X 
is in $\mathfrak{g}$. Point 2 is easy to verify if X and Y commute, since, in that case, 
$e^{t(X+Y)} = e^{tX} e^{tY}$. If X and Y do not commute, this argument does not work. 
However, the Lie product formula states that 


$$e^{t(X+Y)} = \lim_{m \to \infty} \left( e^{tX/m} e^{tY/m} \right)^m.$$


Because X and Y are in the Lie algebra, $e^{tX/m}$ and $e^{tY/m}$ are in G, as is 
$(e^{tX/m} e^{tY/m})^m$, since G is a group. However, because G is a matrix Lie group, 
the limit of things in G must be again in G, provided that the limit is invertible. 
Since $e^{t(X+Y)}$ is automatically invertible, we conclude that it must be in G. 
This shows that X + Y is in $\mathfrak{g}$. 


Now for Point 3. Recall (Proposition 2.4) that $\left. \frac{d}{dt} e^{tX} \right|_{t=0} = X$. It follows 
that $\left. \frac{d}{dt} e^{tX} Y \right|_{t=0} = XY$, and, hence, by the product rule (Exercise 3), 


$$\left. \frac{d}{dt} \left( e^{tX} Y e^{-tX} \right) \right|_{t=0} = (XY) e^0 + (e^0 Y)(-X) = XY - YX.$$


Now, by Proposition 2.17, $e^{tX} Y e^{-tX}$ is in $\mathfrak{g}$ for all t. Furthermore, we have (by 
Points 1 and 2) established that $\mathfrak{g}$ is a real subspace of $M_n(\mathbb{C})$. This means, 
in particular, that $\mathfrak{g}$ is a topologically closed subset of $M_n(\mathbb{C})$. It follows that 


$$XY - YX = \lim_{h \to 0} \frac{e^{hX} Y e^{-hX} - Y}{h}$$


belongs to $\mathfrak{g}$. □ 
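
The limit used for Point 3 can be watched numerically. The sketch below takes two elements of so(3) and compares the difference quotient $(e^{hX} Y e^{-hX} - Y)/h$ for a small h with the commutator XY - YX. The helper names, the step h, and the choice of so(3) are assumptions made here for illustration. 

```python
def mat_mul(A, B):
    """Product of two square matrices stored as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_exp(X, terms=40):
    """Partial sum of the exponential series (2.1)."""
    n = len(X)
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for m in range(1, terms):
        term = [[entry / m for entry in row] for row in mat_mul(term, X)]
        result = [[result[i][j] + term[i][j] for j in range(n)]
                  for i in range(n)]
    return result

def bracket(X, Y):
    """The commutator XY - YX."""
    n = len(X)
    XY, YX = mat_mul(X, Y), mat_mul(Y, X)
    return [[XY[i][j] - YX[i][j] for j in range(n)] for i in range(n)]

def conjugation_quotient(X, Y, h):
    """The difference quotient (e^{hX} Y e^{-hX} - Y) / h."""
    n = len(X)
    E_plus = mat_exp([[h * x for x in row] for row in X])
    E_minus = mat_exp([[-h * x for x in row] for row in X])
    C = mat_mul(mat_mul(E_plus, Y), E_minus)
    return [[(C[i][j] - Y[i][j]) / h for j in range(n)] for i in range(n)]

# Two skew-symmetric matrices, i.e. elements of so(3).
X = [[0.0, -1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
Y = [[0.0, 0.0, 0.0], [0.0, 0.0, -1.0], [0.0, 1.0, 0.0]]
B = bracket(X, Y)
Q = conjugation_quotient(X, Y, 1e-5)
print(B)
print(Q)
```

The commutator B is again skew-symmetric, as Point 3 predicts for so(3), and Q agrees with B up to an error of order h. 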


Definition 2.19. Given two n x n matrices A and B, the bracket (or 
commutator) of A and B, denoted [A, B], is defined to be 


$$[A, B] = AB - BA.$$


According to Theorem 2.18, the Lie algebra of any matrix Lie group is closed 
under brackets. 

It is important to note that even if the elements of G have complex entries, 
the Lie algebra $\mathfrak{g}$ of G is not necessarily a complex vector space. That is, for 
X in $\mathfrak{g}$, iX may not be in $\mathfrak{g}$. For example, elements of SU(n) will, in general, 
have complex entries (i.e., SU(n) is not contained in GL(n;R)). Nevertheless, 
if X is in the Lie algebra su(n), then $X^* = -X$ and, so, $(iX)^* = iX$. This 
means that iX is not in su(n) unless X is zero. 


Definition 2.20. A matrix Lie group G is said to be complex if its Lie 
algebra $\mathfrak{g}$ is a complex subspace of $M_n(\mathbb{C})$ (i.e., if $iX \in \mathfrak{g}$ for all $X \in \mathfrak{g}$). 


Examples of complex groups are GL(n; C), SL(n; C), SO(n; C), and Sp(n; C). 
The condition in Definition 2.20 is equivalent to the condition that G be a 
complex submanifold of GL(n;C). (See Appendix C.) 

We return now to the setting of general, not necessarily complex, matrix 
Lie groups. The following very important theorem tells us that a Lie group 
homomorphism between two Lie groups gives rise in a natural way to a map 
between the corresponding Lie algebras. In particular, this will tell us that 
two isomorphic Lie groups have “the same” Lie algebras (i.e., the Lie algebras 
are isomorphic in the sense of Section 2.8). See Exercise 12. 


Theorem 2.21. Let G and H be matrix Lie groups, with Lie algebras g and 
h, respectively. Suppose that ® : G —> H is a Lie group homomorphism. Then, 
there exists a unique real linear map @: g — h such that 


B(e*) = et) (2.16) 
for all X € g. The map ¢ has following additional properties: 


1. $(AXA-}) = ®(A)4(X)(A)-1, for all X €g, AEG 
2. O([X,Y]) = [o(X), d(¥)], for all X,Y € g 
3. A(X) = FP) o for all X Eg 


Suppose that G, H, and K are matrix Lie groups and ®: H > K and 
Y : G —> H are Lie group homomorphisms. Let A : G > K be the composition 
of ® and Y, A(A) = ®(W(A)). Let 4, Y, and À be the associated Lie algebra 
maps. Then, 


In practice, given a Lie group homomorphism Φ, the way one goes about
computing φ is by using Property 3. Of course, since φ is (real) linear, it
suffices to compute φ on a basis for g. In the language of differentiable
manifolds, Property 3 says that φ is the derivative (or differential) of Φ at
the identity, which is the standard definition of φ. (See also Corollary 2.35
in Section 2.7.)
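
Property 3 can be tried out numerically. The following is only a sketch
under assumptions of my own choosing, not part of the theorem: it takes
Φ = det, viewed as a homomorphism of GL(2;C) into GL(1;C), approximates the
derivative of Φ(e^{tX}) at t = 0 by a central difference, and compares the
result with trace(X), which is the associated Lie algebra map in this case
because det(e^X) = e^{trace(X)}. The helper names (`matmul`, `expm`, `det2`,
`scaled`), the sample matrix, and the step size `h` are all hypothetical.

```python
import cmath

def matmul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def expm(x, terms=60):
    """Matrix exponential via the truncated power series sum of x^m / m!."""
    n = len(x)
    result = [[complex(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for m in range(1, terms):
        # term becomes x^m / m!
        term = [[v / m for v in row] for row in matmul(term, x)]
        result = [[result[i][j] + term[i][j] for j in range(n)]
                  for i in range(n)]
    return result

def det2(a):
    """Determinant of a 2 x 2 matrix (the homomorphism Phi of this sketch)."""
    return a[0][0] * a[1][1] - a[0][1] * a[1][0]

def scaled(x, t):
    return [[t * v for v in row] for row in x]

X = [[0.3, 1.2], [-0.7, 0.5]]   # arbitrary sample element of gl(2;C)
h = 1e-5                        # finite-difference step (assumed small enough)

# Property 3: phi(X) is the derivative of Phi(exp(tX)) at t = 0.
phi_X = (det2(expm(scaled(X, h))) - det2(expm(scaled(X, -h)))) / (2 * h)
trace_X = X[0][0] + X[1][1]

derivative_error = abs(phi_X - trace_X)
# Relation (2.16): Phi(exp X) = exp(phi(X)).
relation_error = abs(det2(expm(X)) - cmath.exp(trace_X))
print(derivative_error < 1e-6, relation_error < 1e-9)
```

The finite difference only approximates the derivative, so the comparison is
made up to a tolerance rather than exactly.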

A linear map with Property 2 is called a Lie algebra homomorphism.
(See Section 2.8.) This theorem says that every Lie group homomorphism
gives rise to a Lie algebra homomorphism. We will see eventually that the
converse is true under certain circumstances. Specifically, suppose that G and
H are Lie groups and that φ : g → h is a Lie algebra homomorphism. If
G is simply connected, then there exists a unique Lie group homomorphism
Φ : G → H such that Φ and φ are related as in Theorem 2.21. (The proof of
this deep result is in Chapter 3.) We now proceed with the proof of Theorem
2.21.


Proof. The proof is similar to the proof of Theorem 2.18. Since Φ is a
continuous group homomorphism, $\Phi(e^{tX})$ will be a one-parameter
subgroup of H, for each X ∈ g. Thus, by Theorem 2.13, there is a unique
matrix Z such that

\[ \Phi(e^{tX}) = e^{tZ} \tag{2.17} \]

for all t ∈ R. This Z must lie in h since $e^{tZ} = \Phi(e^{tX}) \in H$.
We now define φ(X) = Z and check in several steps that φ has the required
properties.


Step 1: $\Phi(e^{X}) = e^{\phi(X)}$.
This follows from (2.17) and our definition of φ, by putting t = 1.


Step 2: φ(sX) = sφ(X) for all s ∈ R.
This is immediate, since if $\Phi(e^{tX}) = e^{tZ}$, then
$\Phi(e^{tsX}) = e^{tsZ}$.


Step 3: φ(X + Y) = φ(X) + φ(Y).
By Steps 1 and 2,

\[ e^{t\phi(X+Y)} = e^{\phi[t(X+Y)]} = \Phi\big(e^{t(X+Y)}\big). \]

By the Lie product formula and the fact that Φ is a continuous
homomorphism, we have

\[ e^{t\phi(X+Y)}
   = \Phi\Big(\lim_{m\to\infty}\big(e^{tX/m}e^{tY/m}\big)^{m}\Big)
   = \lim_{m\to\infty}\big(\Phi(e^{tX/m})\,\Phi(e^{tY/m})\big)^{m}. \]

However, we then have

\[ e^{t\phi(X+Y)}
   = \lim_{m\to\infty}\big(e^{t\phi(X)/m}e^{t\phi(Y)/m}\big)^{m}
   = e^{t(\phi(X)+\phi(Y))}. \]

Differentiating this result at t = 0 gives the desired result.


Step 4: $\phi(AXA^{-1}) = \Phi(A)\,\phi(X)\,\Phi(A)^{-1}$.
By Steps 1 and 2,

\[ \exp t\phi(AXA^{-1}) = \exp \phi(tAXA^{-1}) = \Phi(\exp tAXA^{-1}). \]

Using a property of the exponential and Step 1, this becomes

\[ \exp t\phi(AXA^{-1}) = \Phi(Ae^{tX}A^{-1})
   = \Phi(A)\,\Phi(e^{tX})\,\Phi(A)^{-1}
   = \Phi(A)\,e^{t\phi(X)}\,\Phi(A)^{-1}. \]

Differentiating this at t = 0 gives the desired result.


Step 5: φ([X,Y]) = [φ(X), φ(Y)].
Recall from the proof of Theorem 2.18 that

\[ [X,Y] = \frac{d}{dt}\, e^{tX} Y e^{-tX}\Big|_{t=0}. \]

Hence,

\[ \phi([X,Y]) = \phi\Big(\frac{d}{dt}\, e^{tX} Y e^{-tX}\Big|_{t=0}\Big)
   = \frac{d}{dt}\,\phi\big(e^{tX} Y e^{-tX}\big)\Big|_{t=0}, \]

where we have used the fact that a derivative commutes with a linear
transformation.
Now, by Step 4,

\[ \phi([X,Y]) = \frac{d}{dt}\,\Phi(e^{tX})\,\phi(Y)\,\Phi(e^{-tX})\Big|_{t=0}
   = \frac{d}{dt}\, e^{t\phi(X)}\,\phi(Y)\,e^{-t\phi(X)}\Big|_{t=0}
   = [\phi(X), \phi(Y)]. \]


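Step 6: $\phi(X) = \frac{d}{dt}\Phi(e^{tX})\big|_{t=0}$.
This follows from (2.17) and our definition of φ.


Step 7: φ is the unique real linear map such that
$\Phi(e^{X}) = e^{\phi(X)}$.
Suppose that ψ is another such map. Then,

\[ e^{t\psi(X)} = e^{\psi(tX)} = \Phi(e^{tX}), \]

so that

\[ \psi(X) = \frac{d}{dt}\Phi(e^{tX})\Big|_{t=0}. \]

Thus, by Step 6, ψ coincides with φ.


Step 8: λ = φ ∘ ψ.
For any X ∈ g,

\[ \Lambda(e^{tX}) = \Phi(\Psi(e^{tX})) = \Phi(e^{t\psi(X)})
   = e^{t\phi(\psi(X))}. \]

Thus, λ(X) = φ(ψ(X)).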


Definition 2.22 (The Adjoint Mapping). Let G be a matrix Lie group,
with Lie algebra g. Then, for each A ∈ G, define a linear map Ad_A : g → g
by the formula

\[ \mathrm{Ad}_A(X) = AXA^{-1}. \]


Proposition 2.23. Let G be a matrix Lie group, with Lie algebra g. Let GL(g)
denote the group of all invertible linear transformations of g. Then, for each
A ∈ G, Ad_A is an invertible linear transformation of g with inverse
$\mathrm{Ad}_{A^{-1}}$, and the map A ↦ Ad_A is a group homomorphism of G
into GL(g). Furthermore, for each A ∈ G, Ad_A satisfies
Ad_A([X,Y]) = [Ad_A(X), Ad_A(Y)] for all X, Y ∈ g.

Proof. Easy. Note that Proposition 2.17 guarantees that Ad_A(X) is actually
in g for all X ∈ g. □


Since g is a real vector space with some dimension k, GL(g) is essentially
the same as GL(k;R). Thus, we will regard GL(g) as a matrix Lie group.
It is easy to show that Ad : G → GL(g) is continuous, and so is a Lie group
homomorphism. By Theorem 2.21, there is an associated real linear map
X ↦ ad_X from the Lie algebra of G to the Lie algebra of GL(g) (i.e., from
g to gl(g)), with the property that

\[ e^{\mathrm{ad}_X} = \mathrm{Ad}(e^{X}). \]

Here, gl(g) is the Lie algebra of GL(g), namely the space of all linear maps of
g to itself.


Proposition 2.24. Let G be a matrix Lie group, let g be its Lie algebra,
and let Ad : G → GL(g) be the Lie group homomorphism defined above. Let
ad : g → gl(g) be the associated Lie algebra map. Then, for all X, Y ∈ g,

\[ \mathrm{ad}_X(Y) = [X,Y]. \tag{2.18} \]


Proof. Recall that by Point 3 of Theorem 2.21, ad can be computed as follows:

\[ \mathrm{ad}_X = \frac{d}{dt}\,\mathrm{Ad}(e^{tX})\Big|_{t=0}. \]

Thus,

\[ \mathrm{ad}_X(Y) = \frac{d}{dt}\,\mathrm{Ad}(e^{tX})(Y)\Big|_{t=0}
   = \frac{d}{dt}\, e^{tX} Y e^{-tX}\Big|_{t=0} = [X,Y], \]

which is what we wanted to prove. □


We have proved, as a consequence of Theorem 2.21 and Proposition 2.24, 
the following result, which we will make use of later. 


Proposition 2.25. For any X in M_n(C), let ad_X : M_n(C) → M_n(C) be given
by ad_X Y = [X,Y]. Then, for any Y in M_n(C), we have

\[ e^{\mathrm{ad}_X} Y = \mathrm{Ad}_{e^{X}} Y = e^{X} Y e^{-X}. \]


This result can also be proved by direct calculation—see Exercise 19. 
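
As a numerical sanity check of Proposition 2.25 (a sketch of my own, not a
proof), one can sum the series for e^{ad_X} applied to Y and compare it with
e^X Y e^{-X}. The helper names and the two sample matrices below are
hypothetical choices; both power series are simply truncated after a fixed
number of terms.

```python
def matmul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def expm(x, terms=60):
    """Matrix exponential via the truncated power series sum of x^m / m!."""
    n = len(x)
    result = [[complex(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for m in range(1, terms):
        term = [[v / m for v in row] for row in matmul(term, x)]
        result = [[result[i][j] + term[i][j] for j in range(n)]
                  for i in range(n)]
    return result

def commutator(x, y):
    """The bracket [x, y] = xy - yx."""
    n = len(x)
    xy, yx = matmul(x, y), matmul(y, x)
    return [[xy[i][j] - yx[i][j] for j in range(n)] for i in range(n)]

def exp_ad(x, y, terms=60):
    """Apply e^{ad_x} to y via the series sum of (ad_x)^m(y) / m!."""
    n = len(y)
    result = [[complex(v) for v in row] for row in y]
    term = [row[:] for row in result]
    for m in range(1, terms):
        # term becomes (ad_x)^m(y) / m!
        term = [[v / m for v in row] for row in commutator(x, term)]
        result = [[result[i][j] + term[i][j] for j in range(n)]
                  for i in range(n)]
    return result

X = [[0.2, -0.5], [0.8, 0.1]]    # arbitrary sample matrices
Y = [[1.0, 2.0], [0.0, -1.0]]
minus_X = [[-v for v in row] for row in X]

lhs = exp_ad(X, Y)                                  # e^{ad_X} Y
rhs = matmul(matmul(expm(X), Y), expm(minus_X))     # e^X Y e^{-X}
max_diff = max(abs(lhs[i][j] - rhs[i][j]) for i in range(2) for j in range(2))
print(max_diff < 1e-9)
```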


2.7 The Exponential Mapping 


Definition 2.26. If G is a matrix Lie group with Lie algebra g, then the
exponential mapping for G is the map

\[ \exp : \mathfrak{g} \to G. \]

That is, the exponential mapping for G is the matrix exponential restricted
to the Lie algebra g of G. We have shown (Theorem 2.9) that every matrix in
GL(n;C) is the exponential of some n × n matrix. Nevertheless, if
G ⊂ GL(n;C) is a closed subgroup, there may exist A in G such that there is
no X in the Lie algebra g of G with exp X = A. Consider, for example, the
matrix

\[ A = \begin{pmatrix} -1 & 1 \\ 0 & -1 \end{pmatrix} \]

in SL(2;C). I claim that there exists no X ∈ sl(2;C) with exp X = A. To
see this, consider an arbitrary matrix X in sl(2;C). Since trace(X) = 0, the
eigenvalues of X are negatives of each other. There are then two possibilities.
First, the eigenvalues of X could both be zero. In that case, exp X will have
1 as an eigenvalue and, so, exp X ≠ A. Second, the eigenvalues of X could be
of the form (λ, −λ), with λ being a nonzero complex number. In that case, X
has distinct eigenvalues and is, therefore, diagonalizable. It follows that
exp X is also diagonalizable. However, A is not diagonalizable. (The
eigenvalues of A are −1 and −1; if it were diagonalizable it would have to be
−I.) This shows that exp X ≠ A. (See also Exercises 26, 27, 29, 30, and 31.)
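
It may help to see that the obstruction really is the trace condition. The
sketch below is an illustration of my own: it takes A to be the matrix with
rows (−1, 1) and (0, −1), which has determinant 1, eigenvalues −1 and −1, and
is not diagonalizable, and exhibits a matrix X in gl(2;C) with exp X = A, in
line with Theorem 2.9. The trace of this particular X is 2πi rather than 0,
so it does not lie in sl(2;C). The helper names are hypothetical.

```python
import cmath

def matmul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def expm(x, terms=60):
    """Matrix exponential via the truncated power series sum of x^m / m!."""
    n = len(x)
    result = [[complex(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for m in range(1, terms):
        term = [[v / m for v in row] for row in matmul(term, x)]
        result = [[result[i][j] + term[i][j] for j in range(n)]
                  for i in range(n)]
    return result

A = [[-1, 1], [0, -1]]

# X = (i*pi) I + N with N nilpotent, so exp X = e^{i*pi} (I + N) = -(I + N).
X = [[1j * cmath.pi, -1], [0, 1j * cmath.pi]]

E = expm(X)
max_diff = max(abs(E[i][j] - A[i][j]) for i in range(2) for j in range(2))
trace_X = X[0][0] + X[1][1]
print(max_diff < 1e-9, abs(trace_X) > 1)
```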

We see, then, that the exponential mapping for a matrix Lie group G 
does not necessarily map g onto G. Furthermore, the exponential mapping 
may not be one-to-one on g. Nevertheless, it provides a crucial mechanism for 
passing information between the group and the Lie algebra. Indeed, we will 
see (Corollary 2.29) that the exponential mapping is locally one-to-one and 
onto, a result that will be essential, for example, in Chapter 3. 


Theorem 2.27. For 0 < ε < ln 2, let
$U_\varepsilon = \{X \in M_n(\mathbb{C}) : \|X\| < \varepsilon\}$ and let
$V_\varepsilon = \exp(U_\varepsilon)$. Suppose G ⊂ GL(n;C) is a matrix Lie
group with Lie algebra g. Then there exists ε ∈ (0, ln 2) such that for all
A ∈ V_ε, A is in G if and only if log A is in g.


The condition ε < ln 2 guarantees (Theorem 2.7) that for all X ∈ U_ε,
log(exp X) is defined and equal to X.

Note that if X = log A is in g, then A = exp X is in G. Thus, the content
of the theorem is that for some ε, having A in V_ε ∩ G implies that log A
must be in g. There are several important consequences of this theorem,
described after the proof.


Proof. We begin with a lemma.


Lemma 2.28. Suppose B_m are elements of G and that B_m → I. Let
Y_m = log B_m, which is defined for all sufficiently large m. Suppose that
Y_m is nonzero for all m and that $Y_m/\|Y_m\| \to Y \in M_n(\mathbb{C})$.
Then, Y ∈ g.


Proof. To show that Y ∈ g, we must show that exp tY ∈ G for all t ∈ R.
As m → ∞, $(t/\|Y_m\|)\,Y_m \to tY$. Note that since B_m → I, Y_m → 0, and,
so, $\|Y_m\| \to 0$. Thus, we can find integers k_m such that
$k_m\|Y_m\| \to t$. Then,

\[ \exp(k_m Y_m) = \exp\Big[(k_m\|Y_m\|)\,\frac{Y_m}{\|Y_m\|}\Big]
   \to \exp(tY). \]

However, $\exp(k_m Y_m) = \exp(Y_m)^{k_m} = (B_m)^{k_m} \in G$ and G is
closed, and we conclude that exp(tY) ∈ G. □


Let us think of M_n(C) as $\mathbb{C}^{n^2} \cong \mathbb{R}^{2n^2}$. Then, g
is a subspace of $\mathbb{R}^{2n^2}$. Let D denote the orthogonal complement
of g with respect to the usual inner product on $\mathbb{R}^{2n^2}$. Consider
the map Φ : g ⊕ D → GL(n;C) given by

\[ \Phi(X,Y) = e^{X}e^{Y}. \]

Of course, we can identify g ⊕ D with $\mathbb{R}^{2n^2}$. Moreover, GL(n;C)
is an open subset of $M_n(\mathbb{C}) \cong \mathbb{R}^{2n^2}$. Thus, we can
regard Φ as a map from $\mathbb{R}^{2n^2}$ to itself.
Now, using the properties of the matrix exponential, we see that

\[ \frac{d}{dt}\Phi(tX,0)\Big|_{t=0} = X, \qquad
   \frac{d}{dt}\Phi(0,tY)\Big|_{t=0} = Y. \]

This shows that the derivative of Φ at the point
$0 \in \mathbb{R}^{2n^2}$ is the identity. (Recall that the derivative at a
point of a function from $\mathbb{R}^{2n^2}$ to itself is a linear map of
$\mathbb{R}^{2n^2}$ to itself, in this case the identity map.) In particular,
the derivative of Φ at 0 is invertible. Thus, the inverse function theorem
says that Φ has a continuous local inverse, defined in a neighborhood of I.

Now, as we have remarked, what we need to prove is that for some ε,
A ∈ V_ε ∩ G implies log A ∈ g. Suppose this is not the case. Then we can find
a sequence A_m in G such that A_m → I as m → ∞ and such that for all m,
log A_m ∉ g. Using the local inverse of the map Φ, we can write A_m (for all
sufficiently large m) as

\[ A_m = e^{X_m}e^{Y_m}, \qquad X_m \in \mathfrak{g},\; Y_m \in D, \]

in such a way that X_m and Y_m tend to zero as m tends to infinity. We must
have Y_m ≠ 0, since otherwise we would have log A_m = X_m ∈ g.

Now, let B_m = exp(−X_m) A_m = exp(Y_m). Then, B_m is in G and B_m → I
as m → ∞. Since the unit sphere in D is compact, we can choose a subsequence
of the Y_m's (still called Y_m) so that $Y_m/\|Y_m\|$ converges to some
Y ∈ D, with $\|Y\| = 1$. Then, by the lemma, Y ∈ g. This is a contradiction,
because D is the orthogonal complement of g. Thus, there must be some ε such
that log A ∈ g for all A in V_ε ∩ G. □


Corollary 2.29. If G is a matrix Lie group with Lie algebra g, there exists
a neighborhood U of 0 in g and a neighborhood V of I in G such that the
exponential mapping takes U homeomorphically onto V.

Proof. Let ε be such that Theorem 2.27 holds and set U = U_ε ∩ g and
V = V_ε ∩ G. The theorem implies that exp takes U onto V. Furthermore, exp
is a homeomorphism of U onto V, since there is a continuous inverse map,
namely, the restriction of the matrix logarithm to V.


Definition 2.30. If U and V are as in Corollary 2.29, then the inverse map
$\exp^{-1} : V \to \mathfrak{g}$ is called the logarithm for G.


Corollary 2.31. If G is a connected matrix Lie group, then every element A
of G can be written in the form

\[ A = e^{X_1}e^{X_2}\cdots e^{X_m} \tag{2.19} \]

for some X_1, X_2, ..., X_m in g.

Even if G is connected, it is definitely not the case in general that every
element of G can be written as a single exponential, A = exp X (with X ∈ g),
as the example given earlier in this section shows.


Proof. Since G is connected, we can find a continuous path A(t) in G with
A(0) = I and A(1) = A. Let V be a neighborhood of I in G as in Corollary
2.29, so that every element of V is the exponential of an element of g. A
standard argument using the compactness of the interval [0,1] shows that we
can pick a sequence of numbers t_0, ..., t_m with
0 = t_0 < t_1 < ⋯ < t_m = 1 such that

\[ A(t_{k-1})^{-1}A(t_k) \in V \]

for all k = 1, ..., m. Then, since A(t_0) = I,

\[ A = \big(A(t_0)^{-1}A(t_1)\big)\big(A(t_1)^{-1}A(t_2)\big)\cdots
       \big(A(t_{m-1})^{-1}A(t_m)\big). \]

If we choose X_k ∈ g with $\exp X_k = A(t_{k-1})^{-1}A(t_k)$
(k = 1, ..., m), we have

\[ A = e^{X_1}\cdots e^{X_m}. \]
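
For a concrete instance of (2.19), consider the matrix A with rows (−1, 1)
and (0, −1), which lies in SL(2;C) but is not a single exponential of an
element of sl(2;C). The sketch below, an illustration of my own with
hypothetical helper names, checks numerically that A is nevertheless a
product of two exponentials of trace-zero matrices.

```python
import cmath

def matmul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def expm(x, terms=60):
    """Matrix exponential via the truncated power series sum of x^m / m!."""
    n = len(x)
    result = [[complex(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for m in range(1, terms):
        term = [[v / m for v in row] for row in matmul(term, x)]
        result = [[result[i][j] + term[i][j] for j in range(n)]
                  for i in range(n)]
    return result

A = [[-1, 1], [0, -1]]

# Two elements of sl(2;C): a diagonal one with exp X1 = -I and a nilpotent one.
X1 = [[1j * cmath.pi, 0], [0, -1j * cmath.pi]]
X2 = [[0, -1], [0, 0]]

product = matmul(expm(X1), expm(X2))
max_diff = max(abs(product[i][j] - A[i][j])
               for i in range(2) for j in range(2))
traces = [X1[0][0] + X1[1][1], X2[0][0] + X2[1][1]]
print(max_diff < 1e-9, traces)
```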


Corollary 2.32. Suppose G is a connected matrix Lie group, H is a matrix
Lie group, and Φ_1 and Φ_2 are Lie group homomorphisms of G into H. Let
φ_1 and φ_2 be the associated Lie algebra homomorphisms. If φ_1 = φ_2, then
Φ_1 = Φ_2.


Proof. Let g be any element of G. Since G is connected, Corollary 2.31 tells
us that g can be written as $g = e^{X_1}e^{X_2}\cdots e^{X_n}$, with each X_i
in the Lie algebra of G. Then,

\[ \begin{aligned}
\Phi_1(g) &= \Phi_1(e^{X_1})\cdots\Phi_1(e^{X_n}) \\
          &= e^{\phi_1(X_1)}\cdots e^{\phi_1(X_n)} \\
          &= e^{\phi_2(X_1)}\cdots e^{\phi_2(X_n)} \\
          &= \Phi_2(e^{X_1})\cdots\Phi_2(e^{X_n}) \\
          &= \Phi_2(g).
\end{aligned} \]

We are now in a position to obtain Theorem 1.19 of Chapter 1 as a
consequence of Theorem 2.27.


Corollary 2.33. Every matrix Lie group G is a smooth embedded submanifold
of M_n(C) and, hence, a Lie group.


Proof. Let ε ∈ (0, ln 2) be such that Theorem 2.27 holds. Then for any
A_0 ∈ G, consider the neighborhood A_0 V_ε of A_0 in M_n(C). Note that
A ∈ A_0 V_ε if and only if $A_0^{-1}A \in V_\varepsilon$. Define a local
coordinate system on A_0 V_ε by writing each A ∈ A_0 V_ε as A = A_0 exp X,
for X ∈ U_ε ⊂ M_n(C). It follows from Theorem 2.27 that (for A ∈ A_0 V_ε)
A ∈ G if and only if X ∈ g. This means that in this local coordinate system
defined near A_0, G looks like the subspace g of M_n(C). Since we can find
such local coordinates near any point A_0 in G, G is an embedded submanifold
of M_n(C). This shows, as discussed in Section C.2.6, that G is a Lie
group. □


Corollary 2.33 implies that a matrix Lie group G is necessarily locally 
path-connected. It follows that G is connected (in the usual topological sense) 
if and only if it is path-connected. Thus our definition of connectedness in 
Section 1.7 (which was actually path-connectedness) is equivalent to the usual 
topological definition. 


Corollary 2.34. Every continuous homomorphism between two matrix Lie
groups is smooth.


Proof. Given A ∈ G, we write nearby elements B ∈ G (as in the proof of
Corollary 2.33) as B = A exp X, X ∈ g. Then,

\[ \Phi(B) = \Phi(A)\,\Phi(\exp X) = \Phi(A)\exp(\phi(X)). \]

This says that in exponential coordinates near A, Φ is a composition of the
linear map φ, the exponential mapping, and multiplication on the left by
Φ(A), all of which are smooth. This shows that Φ is smooth near any point
A ∈ G. □


Corollary 2.35. Suppose G ⊂ GL(n;C) is a matrix Lie group with Lie algebra
g. Then, a matrix X is in g if and only if there exists a smooth curve γ in
M_n(C) such that 1) γ(t) lies in G for all t; 2) γ(0) = I;
3) $d\gamma/dt|_{t=0} = X$.
Thus, g is the tangent space at the identity to G.


See Proposition C.3 for a description of the tangent space of an embedded 
submanifold in terms of derivatives of smooth curves. 


Proof. If X is in g, then we may take γ(t) = exp(tX) and then γ(0) = I and
$d\gamma/dt|_{t=0} = X$. In the other direction, suppose that γ(t) is a
smooth curve in G with γ(0) = I. Then, by Theorem 2.27, log(γ(t)) is in g for
all sufficiently small t. Now, g is a real subspace of M_n(C) and, therefore,
also a topologically closed subset of M_n(C). Thus, the quantity

\[ \frac{d\log(\gamma(t))}{dt}\Big|_{t=0}, \]

being a limit of difference quotients that lie in g, is again in g. However,

\[ \log(\gamma(t)) = (\gamma(t) - I) - \frac{(\gamma(t) - I)^2}{2}
   + \frac{(\gamma(t) - I)^3}{3} - \cdots. \]

If we differentiate this term by term (it is not hard to see that this is
permitted) and apply the product rule, all terms but the first will give zero
at t = 0. (For example, the derivative of the second term is
$-\tfrac{1}{2}[(d\gamma/dt)(\gamma(t) - I) + (\gamma(t) - I)(d\gamma/dt)]$,
which is equal to zero at t = 0.) Thus, we obtain that

\[ \frac{d\log(\gamma(t))}{dt}\Big|_{t=0}
   = \frac{d\gamma}{dt}\Big|_{t=0} \in \mathfrak{g}. \]
□
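
Corollary 2.35 is easy to test on a familiar curve. The sketch below is an
illustration of my own, with a hypothetical curve and step size: it takes
γ(t) to be rotation by the angle t, a smooth curve in SO(2) through the
identity, approximates dγ/dt at t = 0 by a central difference, and checks
that the result is skew-symmetric, as an element of so(2) must be.

```python
import math

def gamma(t):
    """A smooth curve in SO(2) with gamma(0) = I: rotation by the angle t."""
    return [[math.cos(t), -math.sin(t)],
            [math.sin(t), math.cos(t)]]

h = 1e-6                         # finite-difference step (assumed small enough)
plus, minus = gamma(h), gamma(-h)

# Central-difference approximation of the velocity of gamma at t = 0.
velocity = [[(plus[i][j] - minus[i][j]) / (2 * h) for j in range(2)]
            for i in range(2)]

# so(2) consists of the real 2 x 2 matrices X with X^T = -X.
skew_defect = max(abs(velocity[i][j] + velocity[j][i])
                  for i in range(2) for j in range(2))

expected = [[0.0, -1.0], [1.0, 0.0]]
error = max(abs(velocity[i][j] - expected[i][j])
            for i in range(2) for j in range(2))
print(skew_defect < 1e-9, error < 1e-6)
```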


2.8 Lie Algebras 


We now consider the abstract notion of a Lie algebra, not necessarily given
to us as the Lie algebra of a matrix Lie group. Proposition 2.39 shows that
the Lie algebra of a matrix Lie group is indeed a Lie algebra in the abstract
sense.


Definition 2.36. A finite-dimensional real or complex Lie algebra is
a finite-dimensional real or complex vector space g, together with a map [·,·]
from g × g into g, with the following properties:


1. [·,·] is bilinear.
2. [X,Y] = −[Y,X] for all X, Y ∈ g.
3. [X,[Y,Z]] + [Y,[Z,X]] + [Z,[X,Y]] = 0 for all X, Y, Z ∈ g.


Condition 2 is called “skew symmetry.” Condition 3 is called the Jacobi
identity. Note also that Condition 2 implies that [X,X] = 0 for all X ∈ g.
We will deal only with finite-dimensional Lie algebras and will from now on
interpret “Lie algebra” as “finite-dimensional Lie algebra.”

It should be emphasized here that g can be any vector space (not neces- 
sarily a space of matrices) and that the “bracket” operation [-,-] can be any 
bilinear, skew-symmetric map that satisfies the Jacobi identity. In particular, 
[X,Y] is not necessarily equal to XY —Y X; indeed, the expression XY — Y X 
does not even make sense in general, since g does not necessarily have a
product operation defined on it. For example, let g = R^3 and define [x, y]
to be x × y, where × is the cross product (vector product). This operation
is, clearly, bilinear and skew-symmetric, and it can be checked that it
satisfies the Jacobi identity. There is, so far as I can see, no product
operation “xy” on R^3 such that x × y = xy − yx.

Although the bracket operation in a Lie algebra does not have to be given
to us as [X,Y] = XY − YX, it is possible to construct Lie algebras in this
way. That is to say, if A is an associative algebra and we define
[·,·] : A × A → A by [X,Y] = XY − YX, then this operation does, indeed, make
A into a Lie algebra. This operation is clearly bilinear and skew-symmetric,
and it is a simple computation to check, using the associativity of A, the
Jacobi identity. For any Lie algebra, the Jacobi identity means that the
bracket operation behaves as if it were XY − YX, even if it is not actually
defined this way. Indeed, it can be shown that every Lie algebra g can be
embedded into some associative algebra A in such a way that the bracket on g
corresponds to the operation XY − YX in A.

If g is a Lie algebra, we can think of the bracket operation as making g 
into an algebra in the general sense. This algebra, however, is not associative. 
The Jacobi identity is to be thought of as a substitute for associativity. 
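
The claim that the cross product on R^3 satisfies the conditions of
Definition 2.36 can be checked by machine. The sketch below is my own
illustration with hypothetical names: it tests skew symmetry and the Jacobi
identity on all vectors with entries in {−1, 0, 1}. Since both conditions
are multilinear and these vectors span R^3, this finite check is enough in
principle, although it is no substitute for the short hand computation.

```python
from itertools import product

def cross(x, y):
    """Cross product of two vectors in R^3, given as 3-tuples."""
    return (x[1] * y[2] - x[2] * y[1],
            x[2] * y[0] - x[0] * y[2],
            x[0] * y[1] - x[1] * y[0])

def add(*vectors):
    """Componentwise sum of any number of 3-tuples."""
    return tuple(sum(components) for components in zip(*vectors))

# All 27 vectors with entries in {-1, 0, 1}; integer arithmetic is exact.
samples = list(product(range(-1, 2), repeat=3))

skew_ok = all(cross(x, y) == tuple(-c for c in cross(y, x))
              for x in samples for y in samples)

jacobi_ok = all(
    add(cross(x, cross(y, z)),
        cross(y, cross(z, x)),
        cross(z, cross(x, y))) == (0, 0, 0)
    for x, y, z in product(samples, repeat=3))

print(skew_ok, jacobi_ok)
```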


Proposition 2.37. The space M_n(R) of all n × n real matrices is a real Lie
algebra with respect to the bracket operation [A,B] = AB − BA. The space
M_n(C) of all n × n complex matrices is a complex Lie algebra with respect to
the same bracket operation.

Let V be a finite-dimensional real or complex vector space, and let gl(V)
denote the space of linear maps of V into itself. Then, gl(V) becomes a real
or complex Lie algebra with the bracket operation [A,B] = AB − BA.


Proof. The only nontrivial point is the Jacobi identity. The only way to prove
this is to write everything out and see, and this is best left to the reader.
Note that each double bracket generates 4 terms, for a total of 12. Each of
the six orderings of {X, Y, Z} occurs twice, once with a plus sign and once
with a minus sign. Note that the associativity of the matrix product is
essential to the proof. □


Definition 2.38. A subalgebra of a real or complex Lie algebra g is a
subspace h of g such that [H_1, H_2] ∈ h for all H_1 and H_2 ∈ h. If g is a
complex Lie algebra and h is a real subspace of g which is closed under
brackets, then h is said to be a real subalgebra of g.

If g and h are Lie algebras, then a linear map φ : g → h is called a
Lie algebra homomorphism if φ([X,Y]) = [φ(X), φ(Y)] for all X, Y ∈ g.
If, in addition, φ is one-to-one and onto, then φ is called a Lie algebra
isomorphism. A Lie algebra isomorphism of a Lie algebra with itself is called
a Lie algebra automorphism.


A subalgebra of a Lie algebra is, again, a Lie algebra. A real subalgebra 
of a complex Lie algebra is a real Lie algebra. The inverse of a Lie algebra 
isomorphism is, again, a Lie algebra isomorphism. 


Proposition 2.39. The Lie algebra g of a matrix Lie group G is a real Lie
algebra.

Proof. By Theorem 2.18, g is a real subalgebra of the space M_n(C) of all
complex matrices and is, thus, a real Lie algebra. □


Theorem 2.40 (Ado). Every finite-dimensional real Lie algebra is isomor- 
phic to a subalgebra of gl(n;R). Every finite-dimensional complex Lie algebra 
is isomorphic to a complex subalgebra of gl(n;C). 


This deep theorem is proved, for example, in Varadarajan (1974). The 
proof is beyond the scope of this book and requires a careful examination of 
the structure of complex Lie algebras. The theorem tells us that every Lie 
algebra is (isomorphic to) a Lie algebra of matrices. This is in contrast to the 
situation for Lie groups, where most, but not all, Lie groups are matrix Lie 
groups—see Section C.3. 

We now introduce the abstract Lie algebra version of the map “ad,” which 
we introduced earlier for the Lie algebra of a matrix Lie group. 


Definition 2.41. Let g be a Lie algebra. For X ∈ g, define a linear map
ad_X : g → g by

\[ \mathrm{ad}_X(Y) = [X,Y]. \]

Thus, “ad” (i.e., the map X ↦ ad_X) can be viewed as a linear map from g
into gl(g), where gl(g) denotes the space of linear operators from g to g.


Since ad_X(Y) is just [X,Y], it might seem foolish to introduce the
additional “ad” notation. However, thinking of [X,Y] as a linear map in Y for
each fixed X gives a somewhat different perspective. In any case, the “ad”
notation is extremely useful in some situations. For example, instead of
writing

\[ [X,[X,[X,[X,Y]]]], \]

we can now write

\[ (\mathrm{ad}_X)^4(Y). \]

This sort of notation will be essential in Chapter 3.

Proposition 2.42. If g is a Lie algebra, then

\[ \mathrm{ad}_{[X,Y]} = \mathrm{ad}_X\mathrm{ad}_Y
   - \mathrm{ad}_Y\mathrm{ad}_X = [\mathrm{ad}_X, \mathrm{ad}_Y]; \]

that is, ad : g → gl(g) is a Lie algebra homomorphism.


Proof. Observe that

\[ \mathrm{ad}_{[X,Y]}(Z) = [[X,Y],Z], \]

whereas

\[ [\mathrm{ad}_X, \mathrm{ad}_Y](Z) = [X,[Y,Z]] - [Y,[X,Z]]. \]

So, we want to show that

\[ [[X,Y],Z] = [X,[Y,Z]] - [Y,[X,Z]] \]

or, equivalently,

\[ 0 = [X,[Y,Z]] + [Y,[Z,X]] + [Z,[X,Y]], \]

which is exactly the Jacobi identity. □

2.8.1 Structure constants


Let g be a finite-dimensional real or complex Lie algebra, and let
X_1, ..., X_n be a basis for g (as a vector space). Then, for each i and j,
[X_i, X_j] can be written uniquely in the form

\[ [X_i, X_j] = \sum_{k=1}^{n} c_{ijk} X_k. \]

The constants c_{ijk} are called the structure constants of g (with respect
to the chosen basis). Clearly, the structure constants determine the bracket
operation on g. In some of the literature, the structure constants play an
important role, although we will not have much necessity to use them in this
book. (They appear mainly in Appendix D, where the quantities ε_{ijk} are the
structure constants for the Lie algebra so(3).) In the physics literature, the
structure constants are defined as
$[X_i, X_j] = \sqrt{-1}\sum_k c_{ijk} X_k$, reflecting the factor of
$\sqrt{-1}$ difference between the physics definition of the Lie algebra and
our own.
The structure constants satisfy the following two conditions:

\[ c_{ijk} + c_{jik} = 0, \]
\[ \sum_{m}\big(c_{ijm}c_{mkl} + c_{jkm}c_{mil} + c_{kim}c_{mjl}\big) = 0 \]

for all i, j, k, l. The first of these conditions comes from the skew symmetry
of the bracket, and the second comes from the Jacobi identity. (The reader is
invited to verify these conditions for himself.)
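
The two conditions on the structure constants can be verified mechanically
in a small example. The sketch below is my own illustration, with
hypothetical names: it uses the Lie algebra R^3 with the cross product as
bracket and the standard basis, reads off c_{ijk} as the k-th component of
the bracket of the i-th and j-th basis vectors, and then checks both
conditions over all index combinations.

```python
from itertools import product

def cross(x, y):
    """Cross product on R^3, used here as the Lie bracket."""
    return (x[1] * y[2] - x[2] * y[1],
            x[2] * y[0] - x[0] * y[2],
            x[0] * y[1] - x[1] * y[0])

basis = [(1, 0, 0), (0, 1, 0), (0, 0, 1)]

# c[i][j][k] is the coefficient of basis[k] in [basis[i], basis[j]].
c = [[cross(basis[i], basis[j]) for j in range(3)] for i in range(3)]

# First condition: c_ijk + c_jik = 0 (skew symmetry of the bracket).
skew_ok = all(c[i][j][k] + c[j][i][k] == 0
              for i, j, k in product(range(3), repeat=3))

# Second condition: the Jacobi identity written in terms of the c_ijk.
jacobi_ok = all(
    sum(c[i][j][m] * c[m][k][l]
        + c[j][k][m] * c[m][i][l]
        + c[k][i][m] * c[m][j][l] for m in range(3)) == 0
    for i, j, k, l in product(range(3), repeat=4))

print(skew_ok, jacobi_ok)
```

With zero-based indices, `c[0][1][2]` is the coefficient of the third basis
vector in the bracket of the first two.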


2.8.2 Direct sums 


If g_1 and g_2 are Lie algebras, we can define the direct sum of g_1 and g_2
as follows. We consider the direct sum of g_1 and g_2 in the vector space
sense, and we define a bracket operation on g_1 ⊕ g_2 by

\[ [(X_1, X_2), (Y_1, Y_2)] = ([X_1, Y_1], [X_2, Y_2]). \]

It is straightforward to verify that this operation satisfies the Jacobi
identity and makes g_1 ⊕ g_2 into a Lie algebra. If G_1 ⊂ GL(n_1;C) and
G_2 ⊂ GL(n_2;C) are matrix Lie groups and G_1 × G_2 is their direct product
(regarded as a subgroup of GL(n_1 + n_2;C) in the obvious way), then it is
easily verified that the Lie algebra of G_1 × G_2 is isomorphic to
g_1 ⊕ g_2.


2.9 The Complexification of a Real Lie Algebra 


Definition 2.43. If V is a finite-dimensional real vector space, then the
complexification of V, denoted V_C, is the space of formal linear combinations

\[ v_1 + iv_2, \]

with v_1, v_2 ∈ V. This becomes a real vector space in the obvious way and
becomes a complex vector space if we define

\[ i(v_1 + iv_2) = -v_2 + iv_1. \]


We could more pedantically define V_C to be the space of ordered pairs
(v_1, v_2) with v_1, v_2 ∈ V, but this is notationally cumbersome. It is
straightforward to verify that the above definition really makes V_C into a
complex vector space. We will regard V as a real subspace of V_C in the
obvious way.


Proposition 2.44. Let g be a finite-dimensional real Lie algebra and g_C its
complexification (as a real vector space). Then, the bracket operation on g
has a unique extension to g_C which makes g_C into a complex Lie algebra. The
complex Lie algebra g_C is called the complexification of the real Lie algebra
g.


Proof. The uniqueness of the extension is obvious, since if the bracket
operation on g_C is to be bilinear, then it must be given by

\[ [X_1 + iX_2,\; Y_1 + iY_2]
   = ([X_1,Y_1] - [X_2,Y_2]) + i\,([X_1,Y_2] + [X_2,Y_1]). \tag{2.20} \]

To show existence, we must now check that (2.20) is really bilinear and skew
symmetric and that it satisfies the Jacobi identity. It is clear that (2.20)
is real bilinear and skew-symmetric. The skew symmetry means that if (2.20)
is complex linear in the first factor, it is also complex linear in the second
factor. Thus, we need only show that

\[ [i(X_1 + iX_2),\; Y_1 + iY_2] = i\,[X_1 + iX_2,\; Y_1 + iY_2]. \tag{2.21} \]

The left-hand side of (2.21) is

\[ [-X_2 + iX_1,\; Y_1 + iY_2]
   = (-[X_2,Y_1] - [X_1,Y_2]) + i\,([X_1,Y_1] - [X_2,Y_2]), \]

whereas the right-hand side of (2.21) is

\[ \begin{aligned}
 & i\,\{([X_1,Y_1] - [X_2,Y_2]) + i\,([X_2,Y_1] + [X_1,Y_2])\} \\
 &\quad = (-[X_2,Y_1] - [X_1,Y_2]) + i\,([X_1,Y_1] - [X_2,Y_2]),
\end{aligned} \]

and, indeed, these are equal.

It remains to check the Jacobi identity. Of course, the Jacobi identity holds
if X, Y, and Z are in g. However, observe that the expression on the left-hand
side of the Jacobi identity is (complex!) linear in X for fixed Y and Z. It
follows that the Jacobi identity holds if X is in g_C, and Y and Z are in g.
The same argument then shows that we can extend to Y in g_C, and then to Z in
g_C. Thus, the Jacobi identity holds in g_C. □

Proposition 2.45. The Lie algebras gl(n;C), sl(n;C), so(n;C), and sp(n;C)
are complex Lie algebras. In addition, we have the following isomorphisms of
complex Lie algebras:

\[ \begin{aligned}
\mathrm{gl}(n;\mathbb{R})_{\mathbb{C}} &\cong \mathrm{gl}(n;\mathbb{C}), \\
\mathrm{u}(n)_{\mathbb{C}} &\cong \mathrm{gl}(n;\mathbb{C}), \\
\mathrm{su}(n)_{\mathbb{C}} &\cong \mathrm{sl}(n;\mathbb{C}), \\
\mathrm{sl}(n;\mathbb{R})_{\mathbb{C}} &\cong \mathrm{sl}(n;\mathbb{C}), \\
\mathrm{so}(n)_{\mathbb{C}} &\cong \mathrm{so}(n;\mathbb{C}), \\
\mathrm{sp}(n;\mathbb{R})_{\mathbb{C}} &\cong \mathrm{sp}(n;\mathbb{C}), \\
\mathrm{sp}(n)_{\mathbb{C}} &\cong \mathrm{sp}(n;\mathbb{C}).
\end{aligned} \]


Proof. From the computations in the previous section, we see easily that the
specified Lie algebras are, in fact, complex subalgebras of gl(n;C) and hence
are complex Lie algebras.

Now, gl(n;C) is the space of all n × n complex matrices, whereas gl(n;R)
is the space of all n × n real matrices. Clearly, then, every X ∈ gl(n;C) can
be written uniquely in the form X_1 + iX_2, with X_1, X_2 ∈ gl(n;R). This
gives us a complex vector space isomorphism of gl(n;R)_C with gl(n;C), and it
is a triviality to check that this is a Lie algebra isomorphism.

On the other hand, u(n) is the space of all n × n complex skew-self-adjoint
matrices. However, if X is any n × n complex matrix, then

\[ X = \frac{X - X^*}{2} + i\,\frac{X + X^*}{2i}, \]

where (X − X*)/2 and (X + X*)/2i are both skew. Thus, X can be written
as a skew matrix plus i times a skew matrix, and it is easy to see that this
decomposition is unique. Thus, every X in gl(n;C) can be written uniquely
as X_1 + iX_2, with X_1 and X_2 in u(n). It follows that u(n)_C ≅ gl(n;C).
If X has trace zero, then so do X_1 and X_2, which shows that
su(n)_C ≅ sl(n;C).
The verification of the remaining isomorphisms is similar and is left as an
exercise to the reader. □


Note that u(n)_C ≅ gl(n;R)_C ≅ gl(n;C). However, u(n) is not isomorphic
to gl(n;R), except when n = 1. The real Lie algebras u(n) and gl(n;R) are
called real forms of the complex Lie algebra gl(n;C). A given complex Lie
algebra may have several nonisomorphic real forms. See Exercise 17.
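
The decomposition used in the proof of Proposition 2.45 and the bracket
formula (2.20) of Proposition 2.44 can be tried out together. The sketch
below is an illustration of my own, with hypothetical helper names and sample
matrices: it splits two trace-zero matrices X and Y as X_1 + iX_2 and
Y_1 + iY_2 with skew-self-adjoint parts, checks that the parts lie in su(2),
and checks that the formula (2.20), applied to the pairs, reproduces the
ordinary commutator [X, Y] in sl(2;C).

```python
def matmul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def combo(s, x, t, y):
    """The linear combination s*x + t*y of two matrices."""
    n = len(x)
    return [[s * x[i][j] + t * y[i][j] for j in range(n)] for i in range(n)]

def comm(x, y):
    """The bracket [x, y] = xy - yx."""
    return combo(1, matmul(x, y), -1, matmul(y, x))

def adjoint(x):
    """Conjugate transpose."""
    n = len(x)
    return [[complex(x[j][i]).conjugate() for j in range(n)] for i in range(n)]

def dist(x, y):
    """Largest entrywise difference between two matrices."""
    n = len(x)
    return max(abs(x[i][j] - y[i][j]) for i in range(n) for j in range(n))

def skew_parts(x):
    """Return (x1, x2) with x = x1 + i*x2 and both parts skew-self-adjoint."""
    xs = adjoint(x)
    x1 = combo(0.5, x, -0.5, xs)          # (X - X*)/2
    x2 = combo(1 / 2j, x, 1 / 2j, xs)     # (X + X*)/(2i)
    return x1, x2

def bracket_c(u, v):
    """Formula (2.20) on pairs (X1, X2) standing for X1 + i*X2."""
    (x1, x2), (y1, y2) = u, v
    real = combo(1, comm(x1, y1), -1, comm(x2, y2))
    imag = combo(1, comm(x1, y2), 1, comm(x2, y1))
    return real, imag

X = [[1 + 2j, 3 - 1j], [0.5j, -1 - 2j]]   # trace zero: an element of sl(2;C)
Y = [[2j, 1 + 1j], [4, -2j]]              # trace zero as well

X1, X2 = skew_parts(X)
Y1, Y2 = skew_parts(Y)
parts = (X1, X2, Y1, Y2)

skew_defect = max(dist(adjoint(p), combo(-1, p, 0, p)) for p in parts)
trace_defect = max(abs(p[0][0] + p[1][1]) for p in parts)
recombine_defect = dist(combo(1, X1, 1j, X2), X)

real, imag = bracket_c((X1, X2), (Y1, Y2))
bracket_defect = dist(combo(1, real, 1j, imag), comm(X, Y))
print(skew_defect, trace_defect, recombine_defect, bracket_defect)
```

All four defects should vanish up to rounding, which is why the check is
phrased in terms of small tolerances rather than exact equality.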

Physicists do not always clearly distinguish between a matrix Lie group 
and its (real) Lie algebra, or between a real Lie algebra and its complexifi- 
cation. Thus, for example, some references in the physics literature to SU(2) 
actually refer to the complexified Lie algebra, sl(2;C). 


2.10 Exercises 


1. The Schwarz inequality from elementary analysis tells us that for all
   u = (u_1, ..., u_n) and v = (v_1, ..., v_n) in C^n, we have

   \[ |u_1v_1 + \cdots + u_nv_n|^2
      \le \Big(\sum_{k=1}^{n}|u_k|^2\Big)\Big(\sum_{k=1}^{n}|v_k|^2\Big). \]

   Use this to verify that ||XY|| ≤ ||X|| ||Y|| for all X, Y ∈ M_n(C), where
   the norm ||X|| of a matrix X is defined by (2.2).

2. Show that for X ∈ M_n(C) and any orthonormal basis {u_1, ..., u_n} of C^n,
   $\|X\|^2 = \sum_{j,k=1}^{n}|\langle u_j, Xu_k\rangle|^2$, where ||X|| is
   defined by (2.2). Now show that if v is an eigenvector for X with
   eigenvalue λ, then |λ| ≤ ||X||.

3. The product rule. Recall that a matrix-valued function A(t) is said to
   be smooth if each A_{ij}(t) is smooth. The derivative of such a function is
   defined as

   \[ \Big(\frac{dA}{dt}\Big)_{ij} = \frac{dA_{ij}}{dt} \]

   or, equivalently,

   \[ \frac{d}{dt}A(t) = \lim_{h\to 0}\frac{A(t+h) - A(t)}{h}. \]

   Let A(t) and B(t) be two such functions. Prove that A(t)B(t) is again
   smooth and that

   \[ \frac{d}{dt}[A(t)B(t)] = \frac{dA}{dt}B(t) + A(t)\frac{dB}{dt}. \]

4. Show that for all X ∈ M_n(C),

   \[ \lim_{m\to\infty}\Big[I + \frac{X}{m}\Big]^{m} = e^{X}. \]

5. Using Theorem B.7, show that every n × n complex matrix A is the limit
   of a sequence of diagonalizable matrices.

   Hint: If the characteristic polynomial of A has n distinct roots, then A is
   diagonalizable.

6. Show that every 2 × 2 matrix X with trace zero satisfies

   \[ X^2 = -\det(X)\,I. \]

   If X is 2 × 2 with trace zero, show by direct calculation using the power
   series for the exponential that

   \[ e^{X} = \cos\big(\sqrt{\det X}\big)\,I
      + \frac{\sin\sqrt{\det X}}{\sqrt{\det X}}\,X. \tag{2.22} \]

   Use this to give an alternative derivation of the result in (2.7).

   Notes: Since the functions cos θ and sin θ/θ are even functions of θ, the
   value of (2.22) is independent of the choice of the square root of det X.
   The value of the coefficient of X in (2.22) is to be interpreted as 1 when
   det X = 0, in accordance with the limit
   $\lim_{\theta\to 0}\sin\theta/\theta = 1$.

7. Use the result of Exercise 6 to compute the exponential of the matrix

   X = [2 × 2 matrix; entries not recoverable from the source]

   Hint: Write X as the sum of a multiple of the identity and a matrix with
   trace zero.

8. A matrix A is said to be unipotent if A − I is nilpotent (i.e., if A is of
   the form A = I + N, with N nilpotent). Note that log A is defined whenever
   A is unipotent, because the series in Definition 2.6 terminates.

   (a) Show that if A is unipotent, then log A is nilpotent.

   (b) Show that if X is nilpotent, then e^X is unipotent.

   (c) Show that if A is unipotent, then exp(log A) = A and that if X is
       nilpotent, then log(exp X) = X.

   Hint: Let A(t) = I + t(A − I). Show that exp(log(A(t))) depends
   polynomially on t and that exp(log(A(t))) = A(t) for all sufficiently
   small t.

9. Show that every invertible n × n matrix A can be written as A = e^X for
   some X ∈ M_n(C).

   Hint: Theorem B.5 implies that A is similar to a block-diagonal matrix
   in which each block is of the form λI + N_λ, with N_λ being nilpotent. Use
   this result and Exercise 8.

10. Give an example of a matrix Lie group G and a matrix X such that
    e^X ∈ G, but X ∉ g.

11. Suppose G is a matrix Lie group in GL(n;C) and let g be its Lie algebra.
    Suppose that A is in G and that ||A − I|| < 1, so that the power series
    for log A is convergent. Is it necessarily the case that log A is in g?
    Prove or give a counterexample.

12. Show that two isomorphic matrix Lie groups have isomorphic Lie algebras.

13. The Lie algebra so(3;1). Write out explicitly the general form of a 4 × 4
    real matrix in so(3;1).

14. Verify directly that Proposition 2.17 and Theorem 2.18 hold for the Lie
    algebra of SU(n).

15. The Lie algebra su(2). Show that the following matrices form a basis for
    the real Lie algebra su(2):

    \[ E_1 = \frac{1}{2}\begin{pmatrix} i & 0 \\ 0 & -i \end{pmatrix}, \quad
       E_2 = \frac{1}{2}\begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}, \quad
       E_3 = \frac{1}{2}\begin{pmatrix} 0 & i \\ i & 0 \end{pmatrix}. \]

    Compute [E_1, E_2], [E_2, E_3], and [E_3, E_1]. Show that there is an
    invertible linear map φ : su(2) → R^3 such that
    φ([X,Y]) = φ(X) × φ(Y) for all X, Y ∈ su(2), where × denotes the cross
    product (vector product) on R^3.

16. The Lie algebras su(2) and so(3). Show that the real Lie algebras su(2)
    and so(3) are isomorphic.

    Note: Nevertheless, the corresponding groups SU(2) and SO(3) are not
    isomorphic. (Rather, SO(3) is isomorphic to SU(2)/{I, −I}.)

17. The Lie algebras su(2) and sl(2;R). Show that su(2) and sl(2;R) are not
isomorphic Lie algebras, even though $\mathrm{su}(2)_{\mathbb{C}} \cong \mathrm{sl}(2;\mathbb{R})_{\mathbb{C}} \cong \mathrm{sl}(2;\mathbb{C})$.

Hint: Using Exercise 15, show that su(2) has no two-dimensional
subalgebras.

18. Let $G$ be a matrix Lie group and let $\mathfrak{g}$ be its Lie algebra. For each $A \in G$,
show that $\mathrm{Ad}_A$ is a Lie algebra automorphism of $\mathfrak{g}$.

19. ("Ad" and "ad") Let $X$ and $Y$ be $n \times n$ matrices. Show by induction that

\[
(\mathrm{ad}_X)^m (Y) = \sum_{k=0}^{m} \binom{m}{k} X^k \, Y \, (-X)^{m-k},
\]

where

\[
(\mathrm{ad}_X)^m (Y) = \underbrace{[X, \cdots [X, [X, Y]] \cdots ]}_{m}.
\]

Now, show by direct computation that

\[
e^{\mathrm{ad}_X}(Y) = \mathrm{Ad}_{e^X}(Y) = e^{X} Y e^{-X}.
\]

Assume that it is legal to multiply power series term by term. (This result
was obtained indirectly in Proposition 2.25.)

Hint: Recall that Pascal's Triangle gives a relationship between numbers
of the form $\binom{m+1}{k}$ and numbers of the form $\binom{m}{k}$.
20. If $\mathfrak{g}$ is a Lie algebra, then a subalgebra $\mathfrak{h}$ of $\mathfrak{g}$ is called an ideal if $[X, H] \in \mathfrak{h}$
for all $X \in \mathfrak{g}$ and $H \in \mathfrak{h}$. If $\phi : \mathfrak{g}_1 \to \mathfrak{g}_2$ is a Lie algebra homomorphism,
show that $\ker \phi$ is an ideal in $\mathfrak{g}_1$.

21. Classify up to isomorphism all one-dimensional and two-dimensional real
Lie algebras. (There is one isomorphism class of one-dimensional algebras
and two isomorphism classes of two-dimensional algebras.)
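22. Show that for any Lie algebra $\mathfrak{g}$ and any $X$ in $\mathfrak{g}$, $\mathrm{ad}_X$ is a derivation of $\mathfrak{g}$;
that is,

\[
\mathrm{ad}_X([Y, Z]) = [\mathrm{ad}_X(Y), Z] + [Y, \mathrm{ad}_X(Z)]
\]

for all $Y$ and $Z$ in $\mathfrak{g}$.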

23. The complexification of a real Lie algebra. Let $\mathfrak{g}$ be a real Lie algebra, $\mathfrak{g}_{\mathbb{C}}$
its complexification, and $\mathfrak{h}$ an arbitrary complex Lie algebra. Show that
every real Lie algebra homomorphism of $\mathfrak{g}$ into $\mathfrak{h}$ extends uniquely to a
complex Lie algebra homomorphism of $\mathfrak{g}_{\mathbb{C}}$ into $\mathfrak{h}$. (This is the universal
property of the complexification of a real Lie algebra. This property can
be used as an alternative definition of the complexification.)

24. If $\mathfrak{g}$ is a Lie algebra, the center of $\mathfrak{g}$ is the set of all $Z \in \mathfrak{g}$ such that
$[X, Z] = 0$ for all $X \in \mathfrak{g}$. Show that the center of $\mathfrak{g}$ is an ideal (as defined
in Exercise 20).

25. Suppose that $G$ is a connected, commutative matrix Lie group with Lie
algebra $\mathfrak{g}$. Show that the exponential mapping for $G$ maps $\mathfrak{g}$ onto $G$.




26. The exponential mapping for the Heisenberg group. Show that the
exponential mapping from the Lie algebra of the Heisenberg group to the
Heisenberg group is one-to-one and onto.

27. The exponential mapping for U(n). Show that the exponential mapping
from u(n) to U(n) is onto, but not one-to-one. (Note that this shows that
U(n) is connected.)

Hint: Every unitary matrix has an orthonormal basis of eigenvectors.
28. Consider the space $\mathrm{gl}(n;\mathbb{C})$ of all $n \times n$ complex matrices. As usual, for
$X \in \mathrm{gl}(n;\mathbb{C})$, define $\mathrm{ad}_X : \mathrm{gl}(n;\mathbb{C}) \to \mathrm{gl}(n;\mathbb{C})$ by $\mathrm{ad}_X(Y) = [X,Y]$. Suppose
that $X$ is a diagonalizable matrix. Show, then, that $\mathrm{ad}_X$ is diagonalizable
as an operator on $\mathrm{gl}(n;\mathbb{C})$.

Hint: Consider first the case where $X$ is actually diagonal.

Note: The problem of diagonalizing $\mathrm{ad}_X$ is an important one that we will
encounter again in Chapter 6, when we consider semisimple Lie algebras.
29. Show explicitly that $\exp : \mathrm{so}(3) \to SO(3)$ is onto.

Hint: Using Exercise 16 from Chapter 1, show that in a suitable
orthonormal basis, $R$ is of the form

\[
R = \begin{pmatrix} 1 & 0 & 0 \\ 0 & \cos\theta & \sin\theta \\ 0 & -\sin\theta & \cos\theta \end{pmatrix}.
\]


30. The exponential mapping for SL(2;R). Show that the image of the
exponential mapping for $SL(2;\mathbb{R})$ consists of precisely those matrices
$A \in SL(2;\mathbb{R})$ such that $\mathrm{trace}(A) > -2$, together with the matrix $-I$ (which
has trace $-2$). To do this, consider the possibilities for the eigenvalues of a
matrix in the Lie algebra $\mathrm{sl}(2;\mathbb{R})$ and in the group $SL(2;\mathbb{R})$. In the Lie
algebra, show that the eigenvalues are of the form $(\lambda, -\lambda)$ or $(i\lambda, -i\lambda)$, with
$\lambda$ real. In the group, show that the eigenvalues are of the form $(a, 1/a)$
or $(-a, -1/a)$, with $a$ real and positive, or of the form $(e^{i\theta}, e^{-i\theta})$, with
$\theta$ real. The case of a repeated eigenvalue ($(0,0)$ in the Lie algebra and
$(1,1)$ or $(-1,-1)$ in the group) will have to be treated separately using
the Jordan canonical form (Section B.4).

Show that the image of the exponential mapping is not dense in $SL(2;\mathbb{R})$.

31. Determine the image of the exponential mapping for $SL(2;\mathbb{C})$. Is the image
of the exponential mapping dense in $SL(2;\mathbb{C})$?
The Baker-Campbell-Hausdorff Formula


In this chapter, we will, as usual, restrict our attention mainly to matrix Lie 
groups. Nevertheless, the proofs of the main results are the same for general 
Lie groups, provided one has already established the basic results about the 
Lie algebra and the exponential mapping for general Lie groups. 


3.1 The Baker—Campbell—Hausdorff Formula for the 
Heisenberg Group 


A crucial result of this chapter will be the following: Let $G$ and $H$ be matrix
Lie groups, with Lie algebras $\mathfrak{g}$ and $\mathfrak{h}$, and suppose that $G$ is simply connected.
Then, if $\phi : \mathfrak{g} \to \mathfrak{h}$ is a Lie algebra homomorphism, there exists a unique Lie
group homomorphism $\Phi : G \to H$ such that $\Phi(\exp X) = \exp(\phi(X))$ for all $X$
in $\mathfrak{g}$. (This is Theorem 3.7 in Section 3.6.) This result is extremely important
because it implies that if $G$ is simply connected, then there is a natural one-to-one
correspondence between the representations of $G$ and the representations
of its Lie algebra $\mathfrak{g}$ (as explained in Chapter 4). In practice, it is much easier
to determine the representations of the Lie algebra than to determine directly
the representations of the corresponding group.

This result (relating Lie algebra homomorphisms and Lie group homo- 
morphisms) is deep. The “modern” proof (e.g., Varadarajan (1974), Theorem 
2.7.5) makes use of the Frobenius theorem, which is both hard to understand 
and hard to prove (Varadarajan (1974), Section 1.3). Our proof will, instead, 
use the Baker-Campbell—Hausdorff formula, which is more easily stated and 
more easily motivated than the Frobenius theorem, but still deep. 

The idea is the following. The desired group homomorphism $\Phi : G \to H$
must satisfy

\[
\Phi(e^X) = e^{\phi(X)}. \tag{3.1}
\]

We would like, then, to define $\Phi$ by this relation. This approach has two
serious difficulties. First, a given element of $G$ may not be expressible as $e^X$
(with $X$ in $\mathfrak{g}$), and even if it is, the $X$ may not be unique. Second, it is very
far from clear why the $\Phi$ in (3.1) (even to the extent it is well defined) should
be a group homomorphism.

It is the second issue which the Baker-Campbell-Hausdorff formula
addresses. (The first issue will be addressed using the simple connectedness of
$G$.) Specifically, (one form of) the Baker-Campbell-Hausdorff formula says
that if $X$ and $Y$ are sufficiently small, then

\[
\log(e^X e^Y) = X + Y + \tfrac{1}{2}[X,Y] + \tfrac{1}{12}[X,[X,Y]] - \tfrac{1}{12}[Y,[X,Y]] + \cdots. \tag{3.2}
\]

It is not supposed to be evident at the moment what "$\cdots$" refers to. The only
important point is that all of the terms in (3.2) are given in terms of $X$ and
$Y$, brackets of $X$ and $Y$, brackets of brackets involving $X$ and $Y$, etc. Then,
because $\phi$ is a Lie algebra homomorphism,

\[
\begin{aligned}
\phi\left(\log(e^X e^Y)\right) &= \phi(X) + \phi(Y) + \tfrac{1}{2}[\phi(X), \phi(Y)] \\
&\quad + \tfrac{1}{12}[\phi(X), [\phi(X), \phi(Y)]] - \tfrac{1}{12}[\phi(Y), [\phi(X), \phi(Y)]] + \cdots \\
&= \log\left(e^{\phi(X)} e^{\phi(Y)}\right).
\end{aligned} \tag{3.3}
\]


The relation (3.3) is extremely significant. For, of course, we have

\[
e^X e^Y = e^{\log(e^X e^Y)},
\]

and so by (3.1),

\[
\Phi(e^X e^Y) = e^{\phi(\log(e^X e^Y))}.
\]

Thus, (3.3) tells us that

\[
\Phi(e^X e^Y) = e^{\log\left(e^{\phi(X)} e^{\phi(Y)}\right)} = e^{\phi(X)} e^{\phi(Y)} = \Phi(e^X)\Phi(e^Y).
\]

Thus, the Baker-Campbell-Hausdorff formula shows that on elements of the
form $e^X$, with $X$ small, $\Phi$ is a group homomorphism. (See Corollary 3.4.)

Another way of looking at this is to say that the Baker-Campbell-Hausdorff
formula shows that all the information about the group product,
at least near the identity, is "encoded" in the Lie algebra. Thus, if $\phi$ is a Lie
algebra homomorphism (which by definition preserves the Lie algebra
structure) and if we define $\Phi$ near the identity by (3.1), then we can expect $\Phi$ to
preserve the group structure (i.e., to be a group homomorphism).

In this section, we will look at how all of this works out in the very special 
case of the Heisenberg group. In the next section, we will consider the general 
situation. 


Theorem 3.1. Suppose $X$ and $Y$ are $n \times n$ complex matrices, and that $X$
and $Y$ commute with their commutator. That is, suppose that

\[
[X, [X,Y]] = [Y, [X,Y]] = 0.
\]

Then,

\[
e^X e^Y = e^{X + Y + \frac{1}{2}[X,Y]}.
\]

This is the special case of (3.2) in which the series terminates after the
$[X,Y]$ term.
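
As a sanity check on the statement of Theorem 3.1, the identity can be tested in exact rational arithmetic on strictly upper triangular $3 \times 3$ matrices, where the exponential series terminates. The following is only an illustrative sketch: the helper names (`heis`, `exp_nilpotent`, and so on) and the particular entries of `X` and `Y` are assumptions made for this example and do not come from the text.

```python
from fractions import Fraction

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def add(A, B):
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def scale(c, A):
    return [[c * a for a in row] for row in A]

def bracket(A, B):
    # Commutator [A, B] = AB - BA.
    return add(matmul(A, B), scale(-1, matmul(B, A)))

def exp_nilpotent(N):
    # For an n x n strictly upper triangular N we have N^n = 0,
    # so the exponential series stops after the N^(n-1) term.
    n = len(N)
    result = [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]
    term = result
    for k in range(1, n):
        term = scale(Fraction(1, k), matmul(term, N))  # N^k / k!
        result = add(result, term)
    return result

def heis(a, b, c):
    # Hypothetical helper: an element of the Heisenberg Lie algebra.
    z = Fraction(0)
    return [[z, Fraction(a), Fraction(c)], [z, z, Fraction(b)], [z, z, z]]

X = heis(1, 2, 3)
Y = heis(-4, 5, Fraction(1, 2))
C = bracket(X, Y)  # commutes with both X and Y

lhs = matmul(exp_nilpotent(X), exp_nilpotent(Y))
rhs = exp_nilpotent(add(add(X, Y), scale(Fraction(1, 2), C)))
naive = exp_nilpotent(add(X, Y))  # what one would get if X and Y commuted

print(lhs == rhs, lhs == naive)  # prints: True False
```

Because the entries are `Fraction` objects, the comparison is exact rather than up to rounding; the second comparison shows that the $\frac{1}{2}[X,Y]$ correction cannot be dropped for this pair.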


Proof. Consider $X$ and $Y$ in $M_n(\mathbb{C})$. We will prove that

\[
e^{tX} e^{tY} = \exp\left(tX + tY + \frac{t^2}{2}[X,Y]\right),
\]

which reduces to the desired result in the case $t = 1$. Since, by assumption,
$[X,Y]$ commutes with $X$ and $Y$, the above relation is equivalent to

\[
e^{tX} e^{tY} e^{-\frac{t^2}{2}[X,Y]} = e^{t(X+Y)}. \tag{3.4}
\]

Let us denote by $A(t)$ the left-hand side of (3.4) and by $B(t)$ the right-hand
side. Our strategy will be to show that $A(t)$ and $B(t)$ satisfy the same
differential equation, with the same initial conditions. We can see immediately
that

\[
\frac{dB}{dt} = B(t)(X + Y).
\]

On the other hand, differentiating $A(t)$ by means of the product rule gives

\[
\begin{aligned}
\frac{dA}{dt} &= e^{tX} X e^{tY} e^{-\frac{t^2}{2}[X,Y]} + e^{tX} e^{tY} Y e^{-\frac{t^2}{2}[X,Y]} \\
&\quad + e^{tX} e^{tY} e^{-\frac{t^2}{2}[X,Y]} \left(-t[X,Y]\right).
\end{aligned} \tag{3.5}
\]

(The correctness of the last term may be verified by differentiating term by
term.)

Now, since $Y$ commutes with $[X,Y]$, it also commutes with $e^{-\frac{t^2}{2}[X,Y]}$. Thus,
the second term on the right in (3.5) can be rewritten as

\[
e^{tX} e^{tY} e^{-\frac{t^2}{2}[X,Y]}\, Y.
\]

The first term on the right in (3.5) is more complicated, since $X$ does not
necessarily commute with $e^{tY}$. However,

\[
\begin{aligned}
X e^{tY} &= e^{tY} e^{-tY} X e^{tY} \\
&= e^{tY} \mathrm{Ad}_{e^{-tY}}(X) \\
&= e^{tY} e^{-t\,\mathrm{ad}_Y}(X).
\end{aligned}
\]

However, since $[Y, [Y,X]] = -[Y, [X,Y]] = 0$,

\[
e^{-t\,\mathrm{ad}_Y}(X) = X - t[Y,X] = X + t[X,Y],
\]

with all higher terms being zero. Using the fact that everything commutes
with $e^{-\frac{t^2}{2}[X,Y]}$ gives

\[
e^{tX} X e^{tY} e^{-\frac{t^2}{2}[X,Y]} = e^{tX} e^{tY} e^{-\frac{t^2}{2}[X,Y]} \left(X + t[X,Y]\right).
\]

Making these substitutions into (3.5) gives

\[
\begin{aligned}
\frac{dA}{dt} &= e^{tX} e^{tY} e^{-\frac{t^2}{2}[X,Y]} \left(X + t[X,Y]\right) + e^{tX} e^{tY} e^{-\frac{t^2}{2}[X,Y]}\, Y \\
&\quad + e^{tX} e^{tY} e^{-\frac{t^2}{2}[X,Y]} \left(-t[X,Y]\right) \\
&= e^{tX} e^{tY} e^{-\frac{t^2}{2}[X,Y]} (X + Y) \\
&= A(t)(X + Y).
\end{aligned}
\]

Thus, $A(t)$ and $B(t)$ satisfy the same differential equation. Moreover,
$A(0) = B(0) = I$. Thus, by standard uniqueness results for ordinary
differential equations, $A(t) = B(t)$ for all $t$. Putting $t = 1$ gives the theorem. □


Theorem 3.2. Let $H$ denote the Heisenberg group and $\mathfrak{h}$ its Lie algebra. Let
$G$ be a matrix Lie group with Lie algebra $\mathfrak{g}$ and let $\phi : \mathfrak{h} \to \mathfrak{g}$ be a Lie
algebra homomorphism. Then, there exists a unique Lie group homomorphism
$\Phi : H \to G$ such that

\[
\Phi(e^X) = e^{\phi(X)}
\]

for all $X \in \mathfrak{h}$.


Proof. Recall (Exercise 26 in Chapter 2) that the Heisenberg group has the
very special property that its exponential mapping is one-to-one and onto.
Let "log" denote the inverse of this map. Define $\Phi : H \to G$ by the formula

\[
\Phi(A) = e^{\phi(\log A)}.
\]

We will show that $\Phi$ is a Lie group homomorphism.

If $X$ and $Y$ are in the Lie algebra of the Heisenberg group ($3 \times 3$ strictly
upper triangular matrices), then $[X,Y]$ is of the form

\[
\begin{pmatrix} 0 & 0 & a \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix};
\]

such a matrix commutes with both $X$ and $Y$. Thus, $X$ and $Y$ commute with
their commutator. Since $\phi$ is a Lie algebra homomorphism, $\phi(X)$ and $\phi(Y)$
will also commute with their commutator:

\[
\begin{aligned}
[\phi(X), [\phi(X), \phi(Y)]] &= \phi([X, [X,Y]]) = 0, \\
[\phi(Y), [\phi(X), \phi(Y)]] &= \phi([Y, [X,Y]]) = 0.
\end{aligned}
\]

We want to show that $\Phi$ is a homomorphism (i.e., that $\Phi(AB) =
\Phi(A)\Phi(B)$). To show this, note that $A$ can be written as $e^X$ for a unique
$X \in \mathfrak{h}$ and $B$ can be written as $e^Y$ for a unique $Y \in \mathfrak{h}$. Thus, by Theorem 3.1,

\[
\Phi(AB) = \Phi(e^X e^Y) = \Phi\left(e^{X + Y + \frac{1}{2}[X,Y]}\right).
\]

Using the definition of $\Phi$ and the fact that $\phi$ is a Lie algebra homomorphism,
we see that

\[
\Phi(AB) = \exp\left(\phi(X) + \phi(Y) + \tfrac{1}{2}[\phi(X), \phi(Y)]\right).
\]

Finally, using Theorem 3.1 again (applied to the elements $\phi(X)$ and $\phi(Y)$),
we have

\[
\Phi(AB) = e^{\phi(X)} e^{\phi(Y)} = \Phi(A)\Phi(B).
\]

Thus, $\Phi$ is a group homomorphism. It is easy to check that $\Phi$ is continuous
(by checking that log, exp, and $\phi$ are all continuous), and, so, $\Phi$ is a Lie group
homomorphism. Moreover, $\Phi$ by definition has the right relationship to $\phi$.
Furthermore, since the exponential mapping is one-to-one and onto, there can
be at most one $\Phi$ with $\Phi(e^X) = e^{\phi(X)}$. □
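
The construction $\Phi = \exp \circ\, \phi \circ \log$ from the proof can be tried out concretely. The sketch below takes $G = H$ and uses a hypothetical Lie algebra homomorphism `phi` that rescales the three coordinates of the Heisenberg algebra by 2, 3, and 6 (chosen so that brackets are preserved); it also tries a linear map `psi` that is not a homomorphism, for contrast. All function names and sample matrices are assumptions made for illustration.

```python
from fractions import Fraction

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def add(A, B):
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def scale(c, A):
    return [[c * a for a in row] for row in A]

IDENTITY = [[Fraction(int(i == j)) for j in range(3)] for i in range(3)]

def heis(a, b, c):
    # Element of the Heisenberg Lie algebra (strictly upper triangular).
    z = Fraction(0)
    return [[z, Fraction(a), Fraction(c)], [z, z, Fraction(b)], [z, z, z]]

def exp_h(X):
    # X^3 = 0, so exp(X) = I + X + X^2/2 exactly.
    return add(add(IDENTITY, X), scale(Fraction(1, 2), matmul(X, X)))

def log_h(A):
    # (A - I)^3 = 0, so log(A) = (A - I) - (A - I)^2/2 exactly.
    N = add(A, scale(-1, IDENTITY))
    return add(N, scale(Fraction(-1, 2), matmul(N, N)))

def phi(X):
    # Hypothetical homomorphism: [phi(X), phi(Y)] = phi([X, Y]) since 2 * 3 = 6.
    return heis(2 * X[0][1], 3 * X[1][2], 6 * X[0][2])

def psi(X):
    # Linear but NOT bracket-preserving (2 * 3 != 1).
    return heis(2 * X[0][1], 3 * X[1][2], X[0][2])

def lift(f):
    # Phi(A) = exp(f(log A)), as in the proof of Theorem 3.2.
    return lambda A: exp_h(f(log_h(A)))

A = exp_h(heis(1, 2, 3))
B = exp_h(heis(-4, 5, Fraction(1, 2)))

Phi, Psi = lift(phi), lift(psi)
phi_is_hom = Phi(matmul(A, B)) == matmul(Phi(A), Phi(B))
psi_is_hom = Psi(matmul(A, B)) == matmul(Psi(A), Psi(B))
print(phi_is_hom, psi_is_hom)  # prints: True False
```

The failure for `psi` is consistent with the role of Theorem 3.1 in the proof: the argument needs $\phi$ to carry $[X,Y]$ to $[\phi(X), \phi(Y)]$.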


3.2 The General Baker-Campbell-Hausdorff Formula 


The importance of the Baker-Campbell-Hausdorff formula lies not in the
details of the formula, but in the fact that there is a formula and in the fact
that it gives $\log(e^X e^Y)$ in terms of brackets of $X$ and $Y$, brackets of brackets,
and so forth. This tells us something very important, namely that (at least
for elements of the form $e^X$, $X$ small) the group product for a matrix Lie
group $G$ is completely expressible in terms of the Lie algebra. (This is because
$\log(e^X e^Y)$ and, hence, also $e^X e^Y$ itself, can be computed in Lie-algebraic
terms by (3.2).)

We will actually state and prove an integral form of the Baker-Campbell-Hausdorff
formula, rather than the series form (3.2). However, the integral
form is sufficient to obtain the desired result (3.3). (See Corollary 3.4.) The
series form of the Baker-Campbell-Hausdorff formula is stated precisely and
proved in Varadarajan (1974), Section 2.15. See also Section 3.5.

Consider the function

\[
g(z) = \frac{\log z}{1 - \frac{1}{z}}.
\]

This function is defined and analytic in the disk $\{|z - 1| < 1\}$, and, thus, for
$z$ in this set, $g(z)$ can be expressed as

\[
g(z) = \sum_{m=0}^{\infty} a_m (z - 1)^m,
\]

for some set of constants $\{a_m\}$. This series has radius of convergence one.
Now, suppose $V$ is a finite-dimensional complex vector space. Choose an
arbitrary basis for $V$, so that $V$ can be identified with $\mathbb{C}^n$ and, thus, the norm
of a linear operator on $V$ can be defined. Then, for any operator $A$ on $V$ with
$\|A - I\| < 1$, we can define

\[
g(A) = \sum_{m=0}^{\infty} a_m (A - I)^m.
\]

We are now ready to state the integral form of the Baker-Campbell-Hausdorff
formula.


Theorem 3.3 (Baker-Campbell-Hausdorff). For all $n \times n$ complex
matrices $X$ and $Y$ with $\|X\|$ and $\|Y\|$ sufficiently small,

\[
\log(e^X e^Y) = X + \int_0^1 g\left(e^{\mathrm{ad}_X} e^{t\,\mathrm{ad}_Y}\right)(Y)\, dt. \tag{3.6}
\]

The proof of this theorem is given in Section 3.4 of this chapter. Note that
$e^{\mathrm{ad}_X} e^{t\,\mathrm{ad}_Y}$ and, hence, also $g(e^{\mathrm{ad}_X} e^{t\,\mathrm{ad}_Y})$ are linear operators on the space
$\mathrm{gl}(n;\mathbb{C})$ of all $n \times n$ complex matrices. In (3.6), this operator is being applied
to the matrix $Y$. The fact that $X$ and $Y$ are assumed small guarantees that
$e^{\mathrm{ad}_X} e^{t\,\mathrm{ad}_Y}$ is close to the identity operator on $\mathrm{gl}(n;\mathbb{C})$ for $0 \le t \le 1$. This
ensures that $g(e^{\mathrm{ad}_X} e^{t\,\mathrm{ad}_Y})$ is well defined.

If $X$ and $Y$ commute, then we expect to have $\log(e^X e^Y) = \log(e^{X+Y}) =
X + Y$. Exercise 5 shows that the Baker-Campbell-Hausdorff formula indeed
gives $X + Y$ in that case.

Formula (3.6) is admittedly horrible looking. However, we are interested
not in the details of the formula but in the fact that it expresses $\log(e^X e^Y)$
(and hence $e^X e^Y$) in terms of the Lie-algebraic quantities $\mathrm{ad}_X$ and $\mathrm{ad}_Y$.

Since the goal of the Baker-Campbell-Hausdorff theorem is to compute
$\log(e^X e^Y)$, one may well ask, "Why do we not simply expand both
exponentials and the logarithm in power series and multiply everything out?" Indeed,
one can do this, and if one does it for the first several terms, one will get
the same answer as the Baker-Campbell-Hausdorff formula. However, there
is a serious problem with this approach, namely: How does one know that the
terms in such an expansion are expressible in terms of commutators?
Consider, for example, the quadratic term. It is clear that this will be a linear
combination of $X^2$, $Y^2$, $XY$, and $YX$. However, to be expressible in terms of
commutators, it must actually be a constant times $(XY - YX)$. Of course,
for the quadratic term, one can just multiply it out and see, and, indeed, one
gets $\frac{1}{2}(XY - YX) = \frac{1}{2}[X,Y]$. However, it is far from clear how to prove that
a similar result occurs for all the higher terms. (See Exercise 6.) Although it
is possible (but not easy) to prove directly that all terms in the expansion of
$\log(e^X e^Y)$ are expressible in terms of commutators (Proposition 1 in Section
V.5 of Jacobson (1962)), this is not the approach we will take.

We now state an important corollary of the Baker-Campbell—Hausdorff 
theorem. 


Corollary 3.4. Let $G$ be a matrix Lie group and $\mathfrak{g}$ its Lie algebra. Suppose
that $\phi : \mathfrak{g} \to \mathrm{gl}(n;\mathbb{C})$ is a Lie algebra homomorphism. Then, for all sufficiently
small $X$ and $Y$ in $\mathfrak{g}$, $\log(e^X e^Y)$ is in $\mathfrak{g}$, and

\[
\phi\left[\log(e^X e^Y)\right] = \log\left(e^{\phi(X)} e^{\phi(Y)}\right). \tag{3.7}
\]


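Proof. The proof uses the same reasoning as in (3.3). Note that if $X$ and $Y$ lie
in some Lie algebra $\mathfrak{g}$, then $\mathrm{ad}_X$ and $\mathrm{ad}_Y$ will leave $\mathfrak{g}$ invariant, and, therefore,
so will $g(e^{\mathrm{ad}_X} e^{t\,\mathrm{ad}_Y})(Y)$. Thus, whenever formula (3.6) holds, $\log(e^X e^Y)$ will
lie in $\mathfrak{g}$. It remains only to verify (3.7). The idea is that if $\phi$ is a Lie algebra
homomorphism, then it will take a big, messy expression involving "ad" and
$X$ and $Y$, and turn it into the same expression with $X$ and $Y$ replaced by
$\phi(X)$ and $\phi(Y)$.

More precisely, since $\phi$ is a Lie algebra homomorphism,

\[
\phi[Y, X] = [\phi(Y), \phi(X)]
\]

or

\[
\phi(\mathrm{ad}_Y(X)) = \mathrm{ad}_{\phi(Y)}(\phi(X)).
\]

More generally,

\[
\phi\left((\mathrm{ad}_Y)^n(X)\right) = (\mathrm{ad}_{\phi(Y)})^n(\phi(X)).
\]

This being the case,

\[
\begin{aligned}
\phi\left(e^{t\,\mathrm{ad}_Y}(X)\right) &= \sum_{m=0}^{\infty} \frac{t^m}{m!}\, \phi\left((\mathrm{ad}_Y)^m(X)\right) \\
&= \sum_{m=0}^{\infty} \frac{t^m}{m!}\, (\mathrm{ad}_{\phi(Y)})^m(\phi(X)) \\
&= e^{t\,\mathrm{ad}_{\phi(Y)}}(\phi(X)).
\end{aligned}
\]

Similarly,

\[
\phi\left(\left(e^{\mathrm{ad}_X} e^{t\,\mathrm{ad}_Y}\right)(Y)\right) = e^{\mathrm{ad}_{\phi(X)}} e^{t\,\mathrm{ad}_{\phi(Y)}}(\phi(Y)).
\]

Assume now that $X$ and $Y$ are small enough that the Baker-Campbell-Hausdorff
formula applies to $X$ and $Y$ and to $\phi(X)$ and $\phi(Y)$. Then, using
the linearity of the integral and reasoning similar to the above, we have

\[
\begin{aligned}
\phi\left[\log(e^X e^Y)\right] &= \phi(X) + \int_0^1 \sum_{m=0}^{\infty} a_m\, \phi\left[\left(e^{\mathrm{ad}_X} e^{t\,\mathrm{ad}_Y} - I\right)^m (Y)\right] dt \\
&= \phi(X) + \int_0^1 \sum_{m=0}^{\infty} a_m \left(e^{\mathrm{ad}_{\phi(X)}} e^{t\,\mathrm{ad}_{\phi(Y)}} - I\right)^m (\phi(Y))\, dt \\
&= \log\left(e^{\phi(X)} e^{\phi(Y)}\right).
\end{aligned}
\]

This is what we wanted to show. □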
3.3 The Derivative of the Exponential Mapping


Before coming to the proof of the Baker-Campbell—Hausdorff formula itself, 
we will obtain a result concerning derivatives of the exponential mapping. 
This result is valuable in its own right and will play a central role in our proof 
of the Baker-Campbell—Hausdorff formula. 

Observe that if $X$ and $Y$ commute, then

\[
e^{X + tY} = e^X e^{tY}
\]

and so

\[
\left.\frac{d}{dt} e^{X+tY}\right|_{t=0} = e^X \left.\frac{d}{dt} e^{tY}\right|_{t=0} = e^X Y.
\]

In general, $X$ and $Y$ do not commute, and

\[
\left.\frac{d}{dt} e^{X+tY}\right|_{t=0} \neq e^X Y.
\]

(However, see Exercise 4.) This, as it turns out, is an important point. In
particular, note that in the language of multivariate calculus,

\[
\left.\frac{d}{dt} e^{X+tY}\right|_{t=0}
= \left\{ \begin{array}{l} \text{directional derivative of ``exp'' at } X, \\ \text{in the direction of } Y \end{array} \right\}. \tag{3.8}
\]

Thus, computing the left-hand side of (3.8) is the same as computing all of
the directional derivatives of the (matrix-valued) function "exp." We expect
the directional derivative to be a linear function of $Y$, for each fixed $X$.

Now, the function

\[
\frac{1 - e^{-z}}{z}
\]

is an entire analytic function of $z$, even at $z = 0$, and is given by the power
series

\[
\frac{1 - e^{-z}}{z} = \sum_{k=0}^{\infty} (-1)^k \frac{z^k}{(k+1)!} = 1 - \frac{z}{2!} + \frac{z^2}{3!} - \cdots.
\]

This series (which has infinite radius of convergence) makes sense when $z$ is
replaced by a linear operator $A$ on some finite-dimensional vector space.

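Theorem 3.5 (Derivative of Exponential). Let $X$ and $Y$ be $n \times n$ complex
matrices. Then,

\[
\begin{aligned}
\left.\frac{d}{dt} e^{X+tY}\right|_{t=0}
&= e^X \left\{ \frac{I - e^{-\mathrm{ad}_X}}{\mathrm{ad}_X}(Y) \right\} \\
&= e^X \left\{ Y - \frac{[X,Y]}{2!} + \frac{[X,[X,Y]]}{3!} - \cdots \right\}.
\end{aligned} \tag{3.9}
\]

More generally, if $X(t)$ is a smooth matrix-valued function, then

\[
\frac{d}{dt} e^{X(t)} = e^{X(t)} \left\{ \frac{I - e^{-\mathrm{ad}_{X(t)}}}{\mathrm{ad}_{X(t)}} \left( \frac{dX}{dt} \right) \right\}. \tag{3.10}
\]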

Note that the directional derivative in (3.9) is indeed linear in $Y$ for each
fixed $X$. Note also that (3.9) is just a special case of (3.10), by taking $X(t) =
X + tY$ and evaluating at $t = 0$.

Furthermore, observe that if $X$ and $Y$ commute, then only the first term
in the series (3.9) survives. In that case, we obtain $\left.\frac{d}{dt} e^{X+tY}\right|_{t=0} = e^X Y$, as
expected.
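
Formula (3.9) can be checked numerically by comparing a finite-difference approximation of the left-hand side with a truncation of the series on the right. The following sketch does this for one pair of $2 \times 2$ matrices; the sample matrices, the step size `h`, the truncation lengths, and all helper names are assumptions chosen for illustration, and the matrix exponential is computed from its Taylor series rather than by a library routine.

```python
def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def add(A, B):
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def scale(c, A):
    return [[c * a for a in row] for row in A]

def bracket(A, B):
    return add(matmul(A, B), scale(-1.0, matmul(B, A)))

def expm(A, terms=40):
    # Truncated Taylor series; adequate for the small matrices used here.
    n = len(A)
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    term = result
    for k in range(1, terms):
        term = scale(1.0 / k, matmul(term, A))
        result = add(result, term)
    return result

def max_abs_diff(A, B):
    return max(abs(a - b) for ra, rb in zip(A, B) for a, b in zip(ra, rb))

def dexp_series(X, Y, terms=25):
    # Right-hand side of (3.9): e^X * sum_k (-1)^k (ad_X)^k(Y) / (k+1)!.
    total = scale(0.0, Y)
    term = Y            # holds (ad_X)^k (Y)
    factorial = 1.0     # holds (k+1)!
    for k in range(terms):
        factorial *= (k + 1)
        total = add(total, scale((-1) ** k / factorial, term))
        term = bracket(X, term)
    return matmul(expm(X), total)

def dexp_finite_difference(X, Y, h=1e-5):
    # Central difference approximation of d/dt exp(X + tY) at t = 0.
    plus = expm(add(X, scale(h, Y)))
    minus = expm(add(X, scale(-h, Y)))
    return scale(1.0 / (2 * h), add(plus, scale(-1.0, minus)))

X = [[0.3, -0.5], [0.2, 0.1]]
Y = [[0.1, 0.4], [-0.3, 0.2]]

series_error = max_abs_diff(dexp_series(X, Y), dexp_finite_difference(X, Y))
naive_error = max_abs_diff(matmul(expm(X), Y), dexp_finite_difference(X, Y))
print(series_error, naive_error)
```

For this non-commuting pair, `series_error` should be at the level of the finite-difference error, while `naive_error` (the discrepancy from $e^X Y$) is visibly larger, in line with the remark that $e^X Y$ is correct only when $X$ and $Y$ commute.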

The formula for the derivative of the exponential mapping is well known. 
The proof here follows that of Tuynman [The Derivation of the Exponential 
Map of Matrices, Amer. Math. Monthly 102 (1995), 818-819]. 


Proof. I prove only form (3.9); then, (3.10) follows by elementary calculus.
For any $n \times n$ matrices $X$ and $Y$, set

\[
\Delta(X,Y) = \left.\frac{d}{dt} e^{X+tY}\right|_{t=0}.
\]

I leave it as an exercise (Exercise 3) to show that $\exp : M_n(\mathbb{C}) \to M_n(\mathbb{C})$ is
a continuously differentiable map. This implies that $\Delta(X,Y)$ is jointly
continuous in $X$ and $Y$ and that it is linear in $Y$ for each fixed $X$ (by a basic
property of continuously differentiable functions of several variables).
Now, for every positive integer $m$, we have

\[
e^{X+tY} = \left[ \exp\left( \frac{X}{m} + t\frac{Y}{m} \right) \right]^m. \tag{3.11}
\]

Thus, applying the product rule (extended to $m$ factors), we will get $m$ terms,
in each of which $m - 1$ of the factors in (3.11) are simply evaluated at $t = 0$
and the remaining factor is differentiated at $t = 0$. So, we get

\[
\begin{aligned}
\left.\frac{d}{dt} e^{X+tY}\right|_{t=0}
&= \sum_{k=0}^{m-1} \exp\left(\frac{X}{m}\right)^{m-k-1} \left[ \left.\frac{d}{dt} \exp\left( \frac{X}{m} + t\frac{Y}{m} \right)\right|_{t=0} \right] \exp\left(\frac{X}{m}\right)^{k} \\
&= \exp\left(\frac{m-1}{m}X\right) \sum_{k=0}^{m-1} \exp\left(\frac{X}{m}\right)^{-k} \Delta\left(\frac{X}{m}, \frac{Y}{m}\right) \exp\left(\frac{X}{m}\right)^{k} \\
&= \exp\left(\frac{m-1}{m}X\right) \left\{ \frac{1}{m} \sum_{k=0}^{m-1} \left[ \exp\left(-\frac{\mathrm{ad}_X}{m}\right) \right]^k \right\} \left( \Delta\left(\frac{X}{m}, Y\right) \right).
\end{aligned} \tag{3.12}
\]

In the third equality, we have used the linearity of $\Delta(X,Y)$ in $Y$ and the
relationship between Ad and ad.

The left-hand side of (3.12) is equal to the right-hand side for each fixed $m$
and thus the left-hand side is equal to the limit as $m \to \infty$ of the right-hand
side. Let us consider what happens as $m \to \infty$ in the last line of (3.12). The
factor in front tends to $\exp(X)$. Since $\Delta(X,Y)$ is jointly continuous in $X$ and
$Y$, the expression $\Delta(X/m, Y)$ tends to $\Delta(0, Y)$, where it is easily verified that
$\Delta(0,Y) = Y$. Thus, it remains only to analyze the behavior of

\[
\frac{1}{m} \sum_{k=0}^{m-1} \left[ \exp\left(-\frac{\mathrm{ad}_X}{m}\right) \right]^k.
\]

This is taken care of in the following lemma.


Lemma 3.6. For any $n \times n$ matrix $X$, we have

\[
\lim_{m \to \infty} \frac{1}{m} \sum_{k=0}^{m-1} \left[ \exp\left(-\frac{\mathrm{ad}_X}{m}\right) \right]^k = \frac{I - e^{-\mathrm{ad}_X}}{\mathrm{ad}_X}. \tag{3.13}
\]

Once this lemma is established, we take the limit as $m \to \infty$ everywhere
and we are done. (Note that the quantities in (3.13) are linear operators on
a finite-dimensional vector space, namely $M_n(\mathbb{C})$, thus essentially $n^2 \times n^2$
matrices. The operation of multiplying an $n^2 \times n^2$ matrix by an $n^2$-component
vector is jointly continuous in the two variables. Thus, we are justified in
separately evaluating the limit in (3.13) and the limit in $\Delta(X/m, Y)$.) We
now turn to the proof of Lemma 3.6.


Proof. Let us first reason at a formal level (i.e., pretending that $\mathrm{ad}_X$ is a
nonzero number instead of an operator). Then, using the usual formula for
the sum of a finite geometric series would give

\[
\frac{1}{m} \sum_{k=0}^{m-1} \left[ \exp\left(-\frac{\mathrm{ad}_X}{m}\right) \right]^k
= \frac{1}{m} \, \frac{1 - \exp(-\mathrm{ad}_X)}{1 - \exp(-\mathrm{ad}_X/m)}
\;\xrightarrow[m \to \infty]{}\; \frac{1 - \exp(-\mathrm{ad}_X)}{\mathrm{ad}_X}.
\]

To give a rigorous argument, we write $\exp(-\mathrm{ad}_X/m)^k$ as $\exp(-k\,\mathrm{ad}_X/m)$ and
compute

\[
\begin{aligned}
\frac{1}{m} \sum_{k=0}^{m-1} \exp\left(-\frac{k\,\mathrm{ad}_X}{m}\right)
&= \frac{1}{m} \sum_{k=0}^{m-1} \sum_{i=0}^{\infty} \frac{1}{i!} \left(-\frac{k\,\mathrm{ad}_X}{m}\right)^i \\
&= \sum_{i=0}^{\infty} \left[ \frac{1}{m} \sum_{k=0}^{m-1} \left(\frac{k}{m}\right)^i \right] \frac{(-1)^i}{i!} (\mathrm{ad}_X)^i.
\end{aligned}
\]

(We have interchanged the finite sum over $k$ with the infinite sum over $i$.)
Now, we may recognize the quantity in square brackets in the last expression
as the Riemann sum approximation to the integral $\int_0^1 x^i\,dx$, where the value
of the integral is $1/(i + 1)$. So, as $m$ tends to infinity, the quantity in square
brackets tends to $1/(i + 1)$. Furthermore, since the function $x^i$ is nondecreasing
on the interval $[0, 1]$, the value of the expression in square brackets will be no
larger than the value of the integral, for each $m$.

Now, each term in the series is a linear operator on $M_n(\mathbb{C})$, which we can
think of as an $n^2 \times n^2$ matrix. The norm of each term (as $n^2 \times n^2$ matrices)
is bounded by

\[
\frac{1}{i+1} \, \frac{1}{i!} \, \|\mathrm{ad}_X\|^i. \tag{3.14}
\]

Now, each entry in an $n^2 \times n^2$ matrix is smaller (in absolute value) than the
norm of the matrix, as is easily verified. Thus, since the sum of the quantities
in (3.14) is finite, we can apply the dominated convergence theorem to each
entry of the matrix-valued sum to justify interchanging the limit $m \to \infty$ with
the infinite sum over $i$. This gives

\[
\lim_{m \to \infty} \frac{1}{m} \sum_{k=0}^{m-1} \exp\left(-\frac{k\,\mathrm{ad}_X}{m}\right)
= \sum_{i=0}^{\infty} \frac{(-1)^i}{(i+1)!} (\mathrm{ad}_X)^i = \frac{I - e^{-\mathrm{ad}_X}}{\mathrm{ad}_X}.
\]
□

This concludes the proof of Theorem 3.5. □
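
The formal step in the proof of Lemma 3.6 treats $\mathrm{ad}_X$ as a nonzero number. In that scalar setting the convergence is easy to observe numerically. The short sketch below replaces $\mathrm{ad}_X$ by a real number `a` (an assumption made purely for illustration; the lemma itself concerns operators) and compares the averaged geometric sum with its claimed limit for increasing `m`.

```python
import math

def averaged_geometric_sum(a, m):
    # (1/m) * sum_{k=0}^{m-1} exp(-k a / m): a left Riemann sum for the
    # integral of exp(-a x) over [0, 1].
    return sum(math.exp(-k * a / m) for k in range(m)) / m

def limit_value(a):
    # (1 - e^{-a}) / a, the scalar version of the right side of (3.13).
    return (1.0 - math.exp(-a)) / a

a = 1.7  # arbitrary nonzero sample value standing in for ad_X
errors = [abs(averaged_geometric_sum(a, m) - limit_value(a))
          for m in (10, 100, 1000, 10000)]
print(errors)
```

The errors shrink roughly in proportion to $1/m$, which is the expected behavior of a one-sided Riemann sum.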


3.4 Proof of the Baker—Campbell—Hausdorff Formula 


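We now turn to the proof of the Baker-Campbell-Hausdorff formula itself.
Define

\[
Z(t) = \log(e^X e^{tY}).
\]

If $X$ and $Y$ are sufficiently small, then $Z(t)$ is defined for $0 \le t \le 1$. It is left
as an exercise to verify that $Z(t)$ is smooth. Our goal is to compute $Z(1)$.

By definition,

\[
e^{Z(t)} = e^X e^{tY},
\]

so that

\[
e^{-Z(t)} \frac{d}{dt} e^{Z(t)} = \left(e^X e^{tY}\right)^{-1} e^X e^{tY}\, Y = Y.
\]

On the other hand, by Theorem 3.5,

\[
e^{-Z(t)} \frac{d}{dt} e^{Z(t)} = \left\{ \frac{I - e^{-\mathrm{ad}_{Z(t)}}}{\mathrm{ad}_{Z(t)}} \right\} \left( \frac{dZ}{dt} \right).
\]

Hence,

\[
\left\{ \frac{I - e^{-\mathrm{ad}_{Z(t)}}}{\mathrm{ad}_{Z(t)}} \right\} \left( \frac{dZ}{dt} \right) = Y.
\]

If $X$ and $Y$ are small enough, then $Z(t)$ will also be small, so that
$[I - e^{-\mathrm{ad}_{Z(t)}}]/\mathrm{ad}_{Z(t)}$ will be close to the identity and thus invertible. So,

\[
\frac{dZ}{dt} = \left\{ \frac{I - e^{-\mathrm{ad}_{Z(t)}}}{\mathrm{ad}_{Z(t)}} \right\}^{-1} (Y). \tag{3.15}
\]

Recall that $e^{Z(t)} = e^X e^{tY}$. Applying the homomorphism "Ad" gives

\[
\mathrm{Ad}_{e^{Z(t)}} = \mathrm{Ad}_{e^X} \mathrm{Ad}_{e^{tY}}.
\]

By the relationship between "Ad" and "ad" (Proposition 2.25), this becomes

\[
e^{\mathrm{ad}_{Z(t)}} = e^{\mathrm{ad}_X} e^{t\,\mathrm{ad}_Y}
\]

or

\[
\mathrm{ad}_{Z(t)} = \log\left(e^{\mathrm{ad}_X} e^{t\,\mathrm{ad}_Y}\right).
\]

Plugging this into (3.15) gives

\[
\frac{dZ}{dt} = \left\{ \frac{I - \left(e^{\mathrm{ad}_X} e^{t\,\mathrm{ad}_Y}\right)^{-1}}{\log\left(e^{\mathrm{ad}_X} e^{t\,\mathrm{ad}_Y}\right)} \right\}^{-1} (Y). \tag{3.16}
\]

Now, observe that

\[
g(z) = \left\{ \frac{1 - z^{-1}}{\log z} \right\}^{-1},
\]

so, formally, (3.16) is the same as

\[
\frac{dZ}{dt} = g\left(e^{\mathrm{ad}_X} e^{t\,\mathrm{ad}_Y}\right)(Y). \tag{3.17}
\]

It is not hard to show that this formal argument is actually correct.

Now we are done, for if we note that $Z(0) = X$ and integrate (3.17), we
get

\[
Z(1) = X + \int_0^1 g\left(e^{\mathrm{ad}_X} e^{t\,\mathrm{ad}_Y}\right)(Y)\, dt,
\]

which is the Baker-Campbell-Hausdorff formula.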


3.5 The Series Form of the Baker-Campbell—Hausdorff 
Formula 


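Let us see how to get the first few terms of the series form of Baker-Campbell-Hausdorff
from the integral form. Recall the function

\[
\begin{aligned}
g(z) &= \frac{z \log z}{z - 1} \\
&= \frac{[1 + (z-1)] \left[ (z-1) - \frac{(z-1)^2}{2} + \frac{(z-1)^3}{3} - \cdots \right]}{(z-1)} \\
&= [1 + (z-1)] \left[ 1 - \frac{z-1}{2} + \frac{(z-1)^2}{3} - \cdots \right].
\end{aligned}
\]

Multiplying this out and combining terms gives

\[
g(z) = 1 + \frac{1}{2}(z-1) - \frac{1}{6}(z-1)^2 + \cdots.
\]

The closed-form expression for $g$ is

\[
g(z) = 1 + \sum_{m=1}^{\infty} \frac{(-1)^{m+1}}{m(m+1)} (z-1)^m.
\]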
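
The coefficients of $g$ can be double-checked by carrying out the multiplication of the two series mechanically. The sketch below (with hypothetical function names, and writing $u = z - 1$) multiplies $1 + u$ by the series for $\log(1+u)/u$ in exact rational arithmetic and compares the result with the closed-form coefficients $(-1)^{m+1}/(m(m+1))$.

```python
from fractions import Fraction

def g_coefficients(order):
    # log(1 + u)/u = sum_{j >= 0} (-1)^j u^j / (j + 1)
    log_over_u = [Fraction((-1) ** j, j + 1) for j in range(order + 1)]
    coeffs = []
    for m in range(order + 1):
        # Multiplying by (1 + u) adds the coefficient of u^(m-1) to that of u^m.
        shifted = log_over_u[m - 1] if m >= 1 else Fraction(0)
        coeffs.append(log_over_u[m] + shifted)
    return coeffs

def closed_form(m):
    # a_0 = 1 and a_m = (-1)^(m+1) / (m (m + 1)) for m >= 1.
    if m == 0:
        return Fraction(1)
    return Fraction((-1) ** (m + 1), m * (m + 1))

computed = g_coefficients(8)
expected = [closed_form(m) for m in range(9)]
print(computed[:4])
```

The first few entries are $1, \tfrac12, -\tfrac16, \tfrac1{12}$, matching the expansion of $g$ given above.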


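Meanwhile,

\[
\begin{aligned}
e^{\mathrm{ad}_X} e^{t\,\mathrm{ad}_Y} - I
&= \left( I + \mathrm{ad}_X + \frac{(\mathrm{ad}_X)^2}{2} + \cdots \right) \left( I + t\,\mathrm{ad}_Y + \frac{t^2 (\mathrm{ad}_Y)^2}{2} + \cdots \right) - I \\
&= \mathrm{ad}_X + t\,\mathrm{ad}_Y + t\,\mathrm{ad}_X \mathrm{ad}_Y + \frac{(\mathrm{ad}_X)^2}{2} + \frac{t^2 (\mathrm{ad}_Y)^2}{2} + \cdots.
\end{aligned}
\]

The crucial observation here is that $e^{\mathrm{ad}_X} e^{t\,\mathrm{ad}_Y} - I$ has no zero-order term, just
first order and higher in $\mathrm{ad}_X$ and $\mathrm{ad}_Y$. Thus, $(e^{\mathrm{ad}_X} e^{t\,\mathrm{ad}_Y} - I)^m$ will contribute
only terms of degree $m$ or higher in $\mathrm{ad}_X$ and/or $\mathrm{ad}_Y$.

We have, then, up to degree 2 in $\mathrm{ad}_X$ and $\mathrm{ad}_Y$,

\[
\begin{aligned}
g\left(e^{\mathrm{ad}_X} e^{t\,\mathrm{ad}_Y}\right)
&= I + \frac{1}{2} \left[ \mathrm{ad}_X + t\,\mathrm{ad}_Y + t\,\mathrm{ad}_X \mathrm{ad}_Y + \frac{(\mathrm{ad}_X)^2}{2} + \frac{t^2 (\mathrm{ad}_Y)^2}{2} + \cdots \right] \\
&\quad - \frac{1}{6} \left[ \mathrm{ad}_X + t\,\mathrm{ad}_Y + \cdots \right]^2 + \cdots \\
&= I + \frac{1}{2}\mathrm{ad}_X + \frac{t}{2}\mathrm{ad}_Y + \frac{t}{2}\mathrm{ad}_X \mathrm{ad}_Y + \frac{(\mathrm{ad}_X)^2}{4} + \frac{t^2 (\mathrm{ad}_Y)^2}{4} \\
&\quad - \frac{1}{6} \left[ (\mathrm{ad}_X)^2 + t^2 (\mathrm{ad}_Y)^2 + t\,\mathrm{ad}_X \mathrm{ad}_Y + t\,\mathrm{ad}_Y \mathrm{ad}_X \right] \\
&\quad + \text{higher-order terms}.
\end{aligned}
\]

We now apply $g(e^{\mathrm{ad}_X} e^{t\,\mathrm{ad}_Y})$ to $Y$ and integrate. So (neglecting higher-order
terms) using Baker-Campbell-Hausdorff and noting that any term with $\mathrm{ad}_Y$
acting first is zero:

\[
\begin{aligned}
\log(e^X e^Y)
&= X + \int_0^1 \left[ Y + \frac{1}{2}[X,Y] + \frac{1}{4}[X,[X,Y]] - \frac{1}{6}[X,[X,Y]] - \frac{t}{6}[Y,[X,Y]] \right] dt \\
&= X + Y + \frac{1}{2}[X,Y] + \left( \frac{1}{4} - \frac{1}{6} \right)[X,[X,Y]] - \frac{1}{6} \int_0^1 t\, dt\; [Y,[X,Y]].
\end{aligned}
\]

Thus, if we do the algebra, we end up with

\[
\begin{aligned}
\log(e^X e^Y) &= X + Y + \frac{1}{2}[X,Y] + \frac{1}{12}[X,[X,Y]] - \frac{1}{12}[Y,[X,Y]] \\
&\quad + \text{higher-order terms}.
\end{aligned}
\]

This is the expression in (3.2).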
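
The three leading orders of (3.2) can be compared against a direct numerical computation of $\log(e^X e^Y)$. In the sketch below, the exponential and logarithm are computed from truncated power series (adequate because the matrices are small), and the sample matrices `X` and `Y`, the truncation lengths, and the helper names are all assumptions made for illustration.

```python
def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def add(*Ms):
    return [[sum(vals) for vals in zip(*rows)] for rows in zip(*Ms)]

def scale(c, A):
    return [[c * a for a in row] for row in A]

def bracket(A, B):
    return add(matmul(A, B), scale(-1.0, matmul(B, A)))

def identity(n):
    return [[float(i == j) for j in range(n)] for i in range(n)]

def expm(A, terms=40):
    result = term = identity(len(A))
    for k in range(1, terms):
        term = scale(1.0 / k, matmul(term, A))
        result = add(result, term)
    return result

def logm_near_identity(A, terms=60):
    # log A = sum_{k>=1} (-1)^(k+1) (A - I)^k / k, valid when ||A - I|| < 1.
    N = add(A, scale(-1.0, identity(len(A))))
    result = scale(0.0, A)
    power = identity(len(A))
    for k in range(1, terms):
        power = matmul(power, N)
        result = add(result, scale((-1) ** (k + 1) / k, power))
    return result

def max_abs_diff(A, B):
    return max(abs(a - b) for ra, rb in zip(A, B) for a, b in zip(ra, rb))

X = [[0.0, 0.1], [0.0, 0.0]]
Y = [[0.0, 0.0], [0.1, 0.0]]

Z = logm_near_identity(matmul(expm(X), expm(Y)))
XY = bracket(X, Y)
order1 = add(X, Y)
order2 = add(order1, scale(0.5, XY))
order3 = add(order2, scale(1.0 / 12, bracket(X, XY)),
             scale(-1.0 / 12, bracket(Y, XY)))

err1, err2, err3 = (max_abs_diff(Z, approx) for approx in (order1, order2, order3))
print(err1, err2, err3)
```

Each additional order should reduce the error by roughly a factor of the size of the matrices, which is what one expects if the omitted terms are fourth order and higher.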


3.6 Group Versus Lie Algebra Homomorphisms 


Recall Theorem 2.21, which says that given matrix Lie groups $G$ and $H$ and a
Lie group homomorphism $\Phi : G \to H$, there is a Lie algebra homomorphism
$\phi : \mathfrak{g} \to \mathfrak{h}$ such that $\Phi(\exp X) = \exp \phi(X)$ for all $X \in \mathfrak{g}$. In this section, we
prove a converse to this result in the case that $G$ is simply connected.


Theorem 3.7. Let $G$ and $H$ be matrix Lie groups with Lie algebras $\mathfrak{g}$ and
$\mathfrak{h}$. Let $\phi : \mathfrak{g} \to \mathfrak{h}$ be a Lie algebra homomorphism. If $G$ is simply connected,
then there exists a unique Lie group homomorphism $\Phi : G \to H$ such that
$\Phi(\exp X) = \exp(\phi(X))$ for all $X \in \mathfrak{g}$.


This has the following corollary.


Corollary 3.8. Suppose $G$ and $H$ are simply-connected matrix Lie groups
with Lie algebras $\mathfrak{g}$ and $\mathfrak{h}$. If $\mathfrak{g}$ is isomorphic to $\mathfrak{h}$, then $G$ is isomorphic to $H$.


Proof. Let $\phi : \mathfrak{g} \to \mathfrak{h}$ be a Lie algebra isomorphism. By Theorem 3.7, there
exists an associated Lie group homomorphism $\Phi : G \to H$. Since $\phi^{-1} : \mathfrak{h} \to
\mathfrak{g}$ is also a Lie algebra homomorphism, there is a corresponding Lie group
homomorphism $\Psi : H \to G$. We want to show that $\Phi$ and $\Psi$ are inverses of
each other.

However, the Lie algebra map associated with the composition is the
composition of the Lie algebra maps (Point 3 of Theorem 2.21), which is the
identity. So, by Corollary 2.32, $\Phi \circ \Psi = I_H$. Similarly, $\Psi \circ \Phi = I_G$. □


We now proceed with the proof of Theorem 3.7. 


Proof. Step 1: Define ® in a neighborhood of the identity. 

Corollary 2.29 says that the exponential mapping for G has a local inverse 
which maps a neighborhood V of the identity into the Lie algebra g. If we 
make V small enough, then we can also assume that for all A, B € V, we have 
log A and log B small enough that the Baker-Campbell-Hausdorff theorem 
applies to them. We fix one such neighborhood V for the remainder of the 
proof. 

On this neighborhood $V$, we can define $\Phi : V \to H$ by

\[
\Phi(A) = \exp\{\phi(\log A)\};
\]

that is, on $V$, we have

\[
\Phi = \exp \circ\, \phi \circ \log.
\]

(Note that if there is to be a homomorphism $\Phi$ as in the theorem, then $\Phi$
must be given by this formula.)


Step 2: Define ® along a path. 

Recall that part of what it means for $G$ to be simply connected is that it is
connected. Recall also that when we say $G$ is connected, we really mean that
$G$ is path-connected. (By now, we know that $G$ is an embedded submanifold
of $GL(n;\mathbb{C})$. This means that $G$ is locally path-connected and, thus, that $G$
is connected if and only if it is path-connected.) Thus, for any $A \in G$, there
exists a path $A(t) \in G$ with $A(0) = I$ and $A(1) = A$. A standard argument
using the compactness of the interval $[0,1]$ shows that there exist numbers
$0 = t_0 < t_1 < t_2 < \cdots < t_m = 1$ such that for all $s$ and $t$ satisfying
$t_i \le s \le t \le t_{i+1}$ (for some $i$), we have

\[
A(t)A(s)^{-1} \in V. \tag{3.18}
\]

In particular, since $t_0 = 0$ and $A(0) = I$, we have $A(t_1) \in V$. We now
write $A = A(1)$ in the form

\[
A = \left[A(1)A(t_{m-1})^{-1}\right] \left[A(t_{m-1})A(t_{m-2})^{-1}\right] \cdots \left[A(t_2)A(t_1)^{-1}\right] A(t_1).
\]

Since $\Phi$ is supposed to be a homomorphism, it is reasonable to "define" $\Phi(A)$
by

\[
\Phi(A) = \Phi\left(A(1)A(t_{m-1})^{-1}\right) \cdots \Phi\left(A(t_2)A(t_1)^{-1}\right) \Phi\left(A(t_1)\right), \tag{3.19}
\]

where each factor on the right is defined as in Step 1.


Step 3: Prove independence of the partition. 

For this definition of $\Phi(A)$ to be valid, we must show that the value of
$\Phi(A)$ is independent of the choice of the path and independent of the choice
of partition $(t_0, \ldots, t_m)$ for a given path. We address independence of the
partition first. It is in this step (and only in this step) that we use the Baker- 
Campbell-Hausdorff theorem. To establish independence of partition, we first 
show that passing from a particular partition to a refinement of that partition 
does not change the result. (A refinement of a partition is one which contains 
all the points of the original partition, together with some other ones.) Note 
that if a given partition satisfies the condition (3.18), then any refinement of 
that partition also satisfies this condition. 

Suppose, now, that we insert an extra partition point $s$ between $t_i$ and
$t_{i+1}$. Then, the factor $\Phi(A(t_{i+1})A(t_i)^{-1})$ in (3.19) will be replaced by

\[
\Phi\left(A(t_{i+1})A(s)^{-1}\right) \Phi\left(A(s)A(t_i)^{-1}\right).
\]

Since $s$ is between $t_i$ and $t_{i+1}$, the condition (3.18) on the original partition
guarantees that $A(t_{i+1})A(s)^{-1}$ and $A(s)A(t_i)^{-1}$, in addition to $A(t_{i+1})A(t_i)^{-1}$,
are all in $V$. Now, it follows from Corollary 3.4 to the Baker-Campbell-Hausdorff
formula that $\Phi$, as defined in Step 1, is a "local homomorphism";
that is, $\Phi(AB) = \Phi(A)\Phi(B)$ for all $A$ and $B$ sufficiently close to the identity.
(When applying the corollary, write $A$ as $e^X$ and $B$ as $e^Y$.) This means that

\[
\Phi\left(A(t_{i+1})A(t_i)^{-1}\right) = \Phi\left(A(t_{i+1})A(s)^{-1}\right) \Phi\left(A(s)A(t_i)^{-1}\right)
\]

and, thus, the value of $\Phi(A)$ is unchanged by the addition of the extra partition
point. By repeating this argument, we see that the value of $\Phi(A)$ does not
change by the addition of any finite number of points to the partition.

Now, given any two partitions, they have a common refinement, namely 
their union. The above argument shows that the value of ®(A) computed from 
the first partition is the same as for the common refinement, which is the same 
as for the second partition. This shows independence of the partition. 


Step 4: Prove independence of the path. 

Having proved that the value of $\Phi(A)$ is independent of the partition for
a fixed path, we now need to prove that $\Phi(A)$ is independent of the choice of
path. It is in this step that we use the simple connectedness of $G$. Suppose
$A_0(t)$ and $A_1(t)$ are two paths joining the identity to some $A \in G$. Then,
since $G$ is simply connected, a standard topological argument shows that $A_0$
and $A_1$ are homotopic with endpoints fixed. This means that there exists a
continuous map $A : [0,1] \times [0,1] \to G$ with


$$A(0,t) = A_0(t), \qquad A(1,t) = A_1(t)$$

for all $t \in [0,1]$ and also

$$A(s,0) = I, \qquad A(s,1) = A$$


for all $s \in [0,1]$.

The compactness of $[0,1] \times [0,1]$ guarantees that there exists an integer
$N$ such that for all $(s,t)$ and $(s',t')$ in $[0,1] \times [0,1]$ with $|s-s'| < 2/N$
and $|t-t'| < 2/N$, we have $A(s,t)A(s',t')^{-1} \in V$. We now employ a standard
topological trick to deform $A_0$ “a little bit at a time” into $A_1$. This means that
we define a sequence $B_{k,l}$ of paths, with $k = 0,\ldots,N-1$ and $l = 0,\ldots,N$.
We define these paths so that $B_{k,l}(t)$ coincides with $A((k+1)/N,t)$ for $t$
between $0$ and $(l-1)/N$, and $B_{k,l}(t)$ coincides with $A(k/N,t)$ for $t$ between
$l/N$ and $1$. For $t$ between $(l-1)/N$ and $l/N$, we define $B_{k,l}(t)$ to coincide with
the values of $A(\cdot,\cdot)$ on the path that goes “diagonally” in the $(s,t)$-plane, as
indicated in Figure 3.1. (I could write the formula for $B_{k,l}$ in this interval,
but the picture is clearer than the formula.) When computing $B_{k,0}$, there are
no $t$-values between $0$ and $(l-1)/N$, so $B_{k,0}(t) = A(k/N,t)$ for all $t \in [0,1]$.
In particular, $B_{0,0}(t) = A_0(t)$.

We think of deforming the path $A_0$ into $A_1$ in steps. First, we deform
$A_0 = B_{0,0}$ into $B_{0,1}$ and then into $B_{0,2}$, $B_{0,3}$, and so on until we reach $B_{0,N}$,
which we then deform into $B_{1,0}$ and then into $B_{1,1},\ldots,B_{1,N}$. We continue
this process until we reach $B_{N-1,N}$, which we finally deform into $A_1$. We want
to show that the value of $\Phi(A)$ computed along each of these paths is the
same as the value of $\Phi(A)$ computed along the next one.

[Fig. 3.1. The path $B_{k,l}$, drawn in the $(s,t)$-plane with $k/N$, $(k+1)/N$, and $1$ marked on the $s$-axis.]

Now, we note that for $l < N$, $B_{k,l}(t)$ and $B_{k,l+1}(t)$ are the same except for
$t$'s in the interval $[(l-1)/N, (l+1)/N]$. We then exploit the independence of the
partition that we have just verified. We may choose any partition we like, provided that
the condition (3.18) is satisfied. So, for both $B_{k,l}$ and $B_{k,l+1}$, we choose the
partition points to be

$$0, \frac{1}{N}, \ldots, \frac{l-1}{N}, \frac{l+1}{N}, \frac{l+2}{N}, \ldots, 1.$$

The way we have chosen $N$ guarantees that this is a valid partition. (Check!)
Now, note (from (3.19)) that the value of $\Phi(A)$ depends only on the values
of the path at the partition points. We have chosen our partition in such a
way that the values of $B_{k,l}$ and $B_{k,l+1}$ are identical at all the partition points,
and, therefore, the value of $\Phi(A)$ is the same for these two paths. A similar
argument shows that the value of $\Phi(A)$ computed along $B_{k,N}$ is the same as
along $B_{k+1,0}$. (Note that $B_{k,N}(1) = B_{k+1,0}(1) = A$.) Thus, the value of $\Phi(A)$
is the same for each path from $A_0 = B_{0,0}$ all the way to $B_{N-1,N}$ and then (by
the same argument) the same as for $A_1$. This shows independence of the path.



Step 5: Prove that $\Phi$ is a homomorphism and is properly related to $\phi$.

The proof that $\Phi$ is a homomorphism is fairly straightforward and is left
to the reader. See Exercise 10. It then remains only to verify that $\Phi$ has the
proper relationship to $\phi$. However, since $\Phi$ is defined near the identity to be
$\Phi = \exp \circ\, \phi \circ \log$, we see that

$$\left.\frac{d}{dt}\Phi(e^{tX})\right|_{t=0} = \left.\frac{d}{dt}e^{t\phi(X)}\right|_{t=0} = \phi(X).$$

Thus, $\phi$ is the Lie algebra homomorphism associated to the Lie group
homomorphism $\Phi$.

This completes the proof of Theorem 3.7. □


We will return to the issue of the relationship between Lie group and Lie 
algebra homomorphisms in Section 4.9 of the next chapter. 


3.7 Covering Groups 


Theorem 3.7 says that if $G$ is simply connected, then every Lie algebra
homomorphism for $G$ can be exponentiated to give a Lie group homomorphism.
If $G$ is not simply connected, this will not (in general) be true. It is thus
reasonable to look for another group $\tilde{G}$ that has the same Lie algebra as $G$ but
such that $\tilde{G}$ is simply connected. Such a group is called the universal covering
group (or just the universal cover) of $G$.


Definition 3.9. Let $G$ be a connected Lie group. Then, a universal covering
group (or universal cover) of $G$ is a simply-connected Lie group $H$
together with a Lie group homomorphism $\Phi : H \to G$ such that the associated
Lie algebra homomorphism $\phi : \mathfrak{h} \to \mathfrak{g}$ is a Lie algebra isomorphism. The
homomorphism $\Phi$ is called the covering homomorphism (or projection
map).


Here neither G nor H is assumed to be a matrix Lie group. As discussed 
later, the universal cover of a matrix Lie group may not be a matrix Lie group. 
For every connected Lie group, a universal cover exists and is unique up to 
“canonical isomorphism,” as explained in the following theorem. 


Theorem 3.10. For any connected Lie group, a universal cover exists. If $G$
is a connected Lie group and $(H_1,\Phi_1)$ and $(H_2,\Phi_2)$ are universal covers of $G$,
then there exists a Lie group isomorphism $\Psi : H_1 \to H_2$ such that $\Phi_2 \circ \Psi = \Phi_1$.


Appendix C gives a sketch of the proof of this result. The uniqueness part 
of the result is a consequence of Theorem 3.7 (Exercise 14). 

Since the universal cover of a connected Lie group $G$ is unique (up to
canonical isomorphism), it is reasonable to speak of the universal cover $(\tilde{G},\Phi)$
of $G$. Furthermore, if $\tilde{G}$ is a simply-connected Lie group and $\phi : \tilde{\mathfrak{g}} \to \mathfrak{g}$ is
a Lie algebra isomorphism, then by Theorem 3.7 (which actually applies to
all Lie groups, not just matrix Lie groups), we can construct an associated
Lie group homomorphism $\Phi : \tilde{G} \to G$. Then $(\tilde{G},\Phi)$ is a universal cover of $G$.
Since $\phi$ is an isomorphism, we can use $\phi$ to identify $\tilde{\mathfrak{g}}$ with $\mathfrak{g}$. Thus, in slightly
less formal terms, we may define the notion of universal cover as follows: The
universal cover of a Lie group $G$ is the unique simply-connected group $\tilde{G}$ such
that the Lie algebra of $\tilde{G}$ is equal to the Lie algebra of $G$. (Implicit in this
form of the definition is that we have chosen some particular isomorphism
$\phi : \tilde{\mathfrak{g}} \to \mathfrak{g}$ to identify $\tilde{\mathfrak{g}}$ with $\mathfrak{g}$.) If we adopt this form of the definition, then
the covering homomorphism is defined as the unique Lie group homomorphism
$\Phi : \tilde{G} \to G$ such that the associated Lie algebra map $\phi : \mathfrak{g} \to \mathfrak{g}$ is the identity.
(The existence of $\Phi$ is by Theorem 3.7.)

The study of universal covering groups is one of the places where we pay a 
price for our decision to consider only matrix Lie groups: the universal cover 
of a matrix Lie group may not be a matrix Lie group. That is, even if $G$ is a
matrix Lie group, the universal cover $\tilde{G}$ of $G$ may not be isomorphic to any
matrix Lie group. For example, the universal cover of $\mathrm{SL}(n;\mathbb{R})$ ($n \geq 2$) is not
a matrix Lie group. See Section C.3. 

One can also consider covering groups that are not universal covers. A 
covering group of a connected Lie group G is a connected Lie group H 
(not necessarily simply connected) together with a Lie group homomorphism 
$\Phi : H \to G$ such that the associated Lie algebra homomorphism $\phi : \mathfrak{h} \to \mathfrak{g}$
is an isomorphism. There may be several nonisomorphic covering groups of 
a given group G, and these different covers may have different fundamental 
groups. 

Let us now consider some examples of universal covers. 


Example 1: $G = S^1$. In this case the universal cover is $\mathbb{R}$ and the covering
homomorphism is the map $\Phi : \mathbb{R} \to S^1$ given by $\theta \mapsto e^{i\theta}$.
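
Example 1 can be checked numerically. The following Python sketch (the helper name `cover` is ours, not the text's) verifies that the map from the real line to the circle sends addition to multiplication, that every integer multiple of 2π lies in its kernel, and that it is one-to-one near the identity, which is what one would expect of a covering homomorphism.

```python
import cmath
import math

def cover(theta: float) -> complex:
    """Covering homomorphism R -> S^1, theta |-> e^{i*theta}."""
    return cmath.exp(1j * theta)

# Homomorphism: addition in R becomes multiplication in S^1.
a, b = 0.7, 2.9
assert abs(cover(a + b) - cover(a) * cover(b)) < 1e-12

# Kernel: every integer multiple of 2*pi maps to the identity 1.
for k in range(-3, 4):
    assert abs(cover(2 * math.pi * k) - 1) < 1e-12

# Near the identity the map is one-to-one: the principal argument
# recovers a small theta.
theta = 0.3
assert abs(cmath.phase(cover(theta)) - theta) < 1e-12
```

The kernel is discrete, so the map is locally invertible even though it is infinitely many to one globally.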

Example 2: $G = \mathrm{SO}(3)$. In this case, the universal cover is $\mathrm{SU}(2)$ and
the covering homomorphism is the map $\Phi$ described in Section 1.6. (See also
Section 4.9.)
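
To see the two-to-one behaviour concretely, here is a hedged Python sketch. It assumes the common identification of SU(2) with the unit quaternions and uses the standard rotation matrix of the conjugation action v ↦ q v q⁻¹. This is one standard realization of the covering map, not necessarily the formula used in Section 1.6, and all function names are illustrative.

```python
import math

def quat_mul(p, q):
    """Hamilton product of quaternions given as (w, x, y, z)."""
    w1, x1, y1, z1 = p
    w2, x2, y2, z2 = q
    return (w1*w2 - x1*x2 - y1*y2 - z1*z2,
            w1*x2 + x1*w2 + y1*z2 - z1*y2,
            w1*y2 - x1*z2 + y1*w2 + z1*x2,
            w1*z2 + x1*y2 - y1*x2 + z1*w2)

def rotation(q):
    """3x3 matrix of v |-> q v q^{-1} for a unit quaternion q."""
    w, x, y, z = q
    return [[1 - 2*(y*y + z*z), 2*(x*y - w*z),     2*(x*z + w*y)],
            [2*(x*y + w*z),     1 - 2*(x*x + z*z), 2*(y*z - w*x)],
            [2*(x*z - w*y),     2*(y*z + w*x),     1 - 2*(x*x + y*y)]]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def transpose(a):
    return [[a[j][i] for j in range(3)] for i in range(3)]

def close(a, b, tol=1e-12):
    return all(abs(a[i][j] - b[i][j]) < tol
               for i in range(3) for j in range(3))

def unit(q):
    n = math.sqrt(sum(c * c for c in q))
    return tuple(c / n for c in q)

IDENTITY = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
p = unit((1.0, 2.0, -1.0, 0.5))
q = unit((0.3, -0.4, 0.8, 1.1))
minus_q = tuple(-c for c in q)

# Homomorphism: the product of quaternions goes to the product of rotations.
assert close(rotation(quat_mul(p, q)), matmul(rotation(p), rotation(q)))
# The image is orthogonal.
assert close(matmul(rotation(q), transpose(rotation(q))), IDENTITY)
# Two-to-one: q and -q give the same rotation.
assert close(rotation(q), rotation(minus_q))
```

Because every entry of the rotation matrix is quadratic in the quaternion components, changing the sign of q cannot change the image, which is the source of the double cover.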


Example 3: $G = \mathrm{U}(n)$. In this case, the universal cover is $\mathbb{R} \times \mathrm{SU}(n)$ and
the covering homomorphism is the map $\Phi : \mathbb{R} \times \mathrm{SU}(n) \to \mathrm{U}(n)$ given by


$$(\theta, U) \mapsto e^{i\theta}U. \qquad (3.20)$$


Note that since both $\mathbb{R}$ and $\mathrm{SU}(n)$ are simply connected (Appendix E),
$\mathbb{R} \times \mathrm{SU}(n)$ is simply connected. It is straightforward to check (Exercise 15) that
the Lie algebra map associated to $\Phi$ is indeed a Lie algebra isomorphism in
this case.
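
A small numerical illustration of (3.20) for n = 2 may be useful. The Python sketch below (helper names are ours) checks that the map respects the product on ℝ × SU(2), that its image is unitary, and that it has a nontrivial kernel element, namely (π, −I), so it is not one-to-one.

```python
import cmath
import math

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def adjoint(a):
    return [[a[j][i].conjugate() for j in range(2)] for i in range(2)]

def su2(alpha: complex, beta: complex):
    """Element of SU(2) from (alpha, beta), rescaled to unit length."""
    n = math.sqrt(abs(alpha) ** 2 + abs(beta) ** 2)
    a, b = alpha / n, beta / n
    return [[a, -b.conjugate()], [b, a.conjugate()]]

def cover(theta: float, u):
    """Phi(theta, U) = e^{i*theta} U, as in (3.20) with n = 2."""
    phase = cmath.exp(1j * theta)
    return [[phase * u[i][j] for j in range(2)] for i in range(2)]

def close(a, b, tol=1e-12):
    return all(abs(a[i][j] - b[i][j]) < tol
               for i in range(2) for j in range(2))

IDENTITY = [[1 + 0j, 0j], [0j, 1 + 0j]]
MINUS_IDENTITY = [[-1 + 0j, 0j], [0j, -1 + 0j]]
u1, u2 = su2(1 + 2j, 0.5 - 1j), su2(-0.3j, 2.0 + 0j)
t1, t2 = 0.4, -1.7

# Homomorphism on the product group R x SU(2).
assert close(cover(t1 + t2, matmul(u1, u2)),
             matmul(cover(t1, u1), cover(t2, u2)))
# The image lies in U(2).
image = cover(t1, u1)
assert close(matmul(image, adjoint(image)), IDENTITY)
# Nontrivial kernel element: (pi, -I) maps to the identity.
assert close(cover(math.pi, MINUS_IDENTITY), IDENTITY)
```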


Example 4: $G = \mathrm{SO}(n)$. For $n \geq 3$, the universal cover of $\mathrm{SO}(n)$ is a
double cover (i.e., the projection map $\Phi$ is two-to-one). This reflects that
the fundamental group of $\mathrm{SO}(n)$ ($n \geq 3$) has two elements. The universal
cover of $\mathrm{SO}(n)$ is called $\mathrm{Spin}(n)$ and may be constructed as a certain group
of invertible elements in the Clifford algebra over $\mathbb{R}^n$. See Bröcker and tom
Dieck (1985), Chapter I, Section 6, especially Propositions I.6.17 and I.6.19.
In particular, $\mathrm{Spin}(n)$ is a matrix Lie group. The cases $n = 3$ and $n = 4$ are
special. For $n = 3$, we have (Example 2) $\mathrm{Spin}(3) \cong \mathrm{SU}(2)$ and for $n = 4$ we
have $\mathrm{Spin}(4) \cong \mathrm{SU}(2) \times \mathrm{SU}(2)$.


In all of these examples, the universal cover turns out to be, again, a 
matrix Lie group. More generally, it is possible to show that the universal 
cover of a compact matrix Lie group is always, again, a matrix Lie group (not 
necessarily compact). 
3.8 Subgroups and Subalgebras


Suppose that G is a matrix Lie group, that H is another matrix Lie group, 
and that $H \subset G$. Then, certainly, the Lie algebra $\mathfrak{h}$ of $H$ will be a subalgebra
of the Lie algebra $\mathfrak{g}$ of $G$. Does this go the other way around? That is, given
a matrix Lie group $G$ with Lie algebra $\mathfrak{g}$ and a subalgebra $\mathfrak{h}$ of $\mathfrak{g}$, is there a
matrix Lie group $H$ whose Lie algebra is $\mathfrak{h}$?

In the case of the Heisenberg group, the answer is yes. This holds because 
for the Heisenberg group, the exponential mapping is one-to-one and onto and 
the Baker-Campbell-Hausdorff formula takes a particularly simple form. (See 
Exercise 16.) 
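
The “particularly simple form” can be checked exactly in a few lines. The Python sketch below uses rational arithmetic and illustrative helper names; it is a check on sample matrices, not a proof. It verifies that for strictly upper triangular 3×3 matrices X and Y one has exp(X)exp(Y) = exp(X + Y + ½[X,Y]), so the Baker-Campbell-Hausdorff series stops after the first bracket.

```python
from fractions import Fraction as F

def matmul(a, b):
    """Product of two 3x3 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def add(a, b, c=1):
    """Entrywise a + c*b."""
    return [[a[i][j] + c * b[i][j] for j in range(3)] for i in range(3)]

IDENTITY = [[F(int(i == j)) for j in range(3)] for i in range(3)]

def heis(a, b, c):
    """Element of the Heisenberg Lie algebra (strictly upper triangular)."""
    z = F(0)
    return [[z, F(a), F(b)], [z, z, F(c)], [z, z, z]]

def exp_nilpotent(x):
    """exp(X) = I + X + X^2/2, exact because X^3 = 0 here."""
    return add(add(IDENTITY, x), matmul(x, x), F(1, 2))

def bracket(x, y):
    return add(matmul(x, y), matmul(y, x), -1)

X = heis(1, 2, 3)
Y = heis(F(-1, 2), 5, 7)
Z = add(add(X, Y), bracket(X, Y), F(1, 2))

# Exact equality in rational arithmetic: no higher brackets are needed.
assert matmul(exp_nilpotent(X), exp_nilpotent(Y)) == exp_nilpotent(Z)
```

Because the bracket of two such matrices is central, all higher-order terms of the series vanish, which is what the exact equality reflects.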

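In general, however, there may not be any matrix Lie group $H$ corresponding
to a given subalgebra $\mathfrak{h}$. For example, let $G = \mathrm{GL}(2;\mathbb{C})$ and let


$$\mathfrak{h} = \left\{ \begin{pmatrix} it & 0 \\ 0 & ita \end{pmatrix} \,\middle|\, t \in \mathbb{R} \right\}, \qquad (3.21)$$


where $a$ is irrational. This is a one-dimensional real subalgebra of $\mathfrak{g} = \mathrm{gl}(2;\mathbb{C})$.
If there were going to be a matrix Lie group $H$ with Lie algebra $\mathfrak{h}$, then $H$
would contain the set of all exponentials of elements of $\mathfrak{h}$, namely


$$H_0 = \left\{ \begin{pmatrix} e^{it} & 0 \\ 0 & e^{ita} \end{pmatrix} \,\middle|\, t \in \mathbb{R} \right\}. \qquad (3.22)$$


To be a matrix Lie group, $H$ would have to be closed in $\mathrm{GL}(2;\mathbb{C})$, and so it
would contain the closure of $H_0$, which (Exercise 1 in Chapter 1) is the set


$$H_1 = \left\{ \begin{pmatrix} e^{it} & 0 \\ 0 & e^{is} \end{pmatrix} \,\middle|\, s, t \in \mathbb{R} \right\}.$$


However, then, the Lie algebra of $H$ would have to contain the Lie algebra of
$H_1$, which is two dimensional!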
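
The failure of closedness can be observed numerically. In the Python sketch below, the choice a = √2 and the helper names `h0` and `approach` are illustrative assumptions. It looks only at parameters t = 2πk, where the first diagonal entry returns to 1, and searches for an integer k whose second entry comes within a small tolerance of a prescribed point of the circle. Such points lie in the two-dimensional torus but are reached only approximately by the one-parameter group.

```python
import cmath
import math

ALPHA = math.sqrt(2)  # illustrative irrational value of a

def h0(t: float):
    """Diagonal entries (e^{it}, e^{ita}) of the element of H_0 with parameter t."""
    return (cmath.exp(1j * t), cmath.exp(1j * t * ALPHA))

def approach(target_phase: float, max_k: int = 5000):
    """Search k <= max_k so that e^{2*pi*i*k*a} is closest to e^{i*target_phase}."""
    target = cmath.exp(1j * target_phase)
    best_k, best_err = 0, abs(1 - target)
    for k in range(1, max_k + 1):
        err = abs(cmath.exp(2j * math.pi * k * ALPHA) - target)
        if err < best_err:
            best_k, best_err = k, err
    return best_k, best_err

k, err = approach(1.0)
first, second = h0(2 * math.pi * k)

# The first entry is (numerically) 1, while the second is close to e^{i}.
assert abs(first - 1) < 1e-9
assert err < 1e-2
```

Raising `max_k` should shrink the error further, consistent with the second entries being dense in the circle.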

Fortunately, all is not lost. We can still get a subgroup $H$ for each
subalgebra $\mathfrak{h}$ if we weaken the condition that $H$ be a matrix Lie group. In the
above example, the subgroup we want is $H_0$, even though $H_0$ is not a matrix
Lie group.


Definition 3.11. If $H$ is any subgroup of $\mathrm{GL}(n;\mathbb{C})$, define the Lie algebra $\mathfrak{h}$
of $H$ to be the set of all matrices $X$ such that


$$e^{tX} \in H$$

for all real $t$.


Definition 3.12. If $G$ is a matrix Lie group with Lie algebra $\mathfrak{g}$, then $H \subset G$
is a connected Lie subgroup of $G$ if the following conditions are satisfied:


1. $H$ is a subgroup of $G$.
2. The Lie algebra $\mathfrak{h}$ of $H$ is a subspace of $\mathfrak{g}$.
3. Every element of $H$ can be written in the form $e^{X_1}e^{X_2}\cdots e^{X_m}$, with
$X_1,\ldots,X_m \in \mathfrak{h}$.


Connected Lie subgroups are also called analytic subgroups. The group
$H_0$ in (3.22) is a connected Lie subgroup of $\mathrm{GL}(2;\mathbb{C})$ whose Lie algebra is
the algebra $\mathfrak{h}$ in (3.21). The word “connected” in the phrase “connected Lie
subgroup” is justified by the following easy result.


Proposition 3.13. If $G$ is a matrix Lie group and $H$ is a connected Lie
subgroup of G, then H is path-connected. That is, any two points in H can be 
connected by a continuous path lying in H. 


Proof. As usual, it suffices to show that any element of $H$ can be connected
to the identity by a continuous path lying in $H$. If $h \in H$ then we write


$$h = e^{X_1}e^{X_2}\cdots e^{X_m}, \qquad X_k \in \mathfrak{h},$$

as in Condition 3 in the definition. We consider the path $h(t)$ given by


$$h(t) = he^{-tX_m} = e^{X_1}e^{X_2}\cdots e^{X_{m-1}}e^{(1-t)X_m}.$$


This path is continuous and lies in $H$, since (by the definition of $\mathfrak{h}$) $e^{-tX_m}$
lies in $H$ for all $t$. As $t$ varies from 0 to 1, $h(t)$ connects the element $h$ to
the element $e^{X_1}e^{X_2}\cdots e^{X_{m-1}}$ of $H$. By applying this process $m$ times, we can
connect $h$ to the identity. □


Proposition 3.14. If G is a matrix Lie group with Lie algebra g and H is a 
connected Lie subgroup of G, then the Lie algebra h of H is a subalgebra of g. 


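Proof. If $A \in H$ and $Y \in \mathfrak{h}$, then $\exp(tAYA^{-1}) = A\exp(tY)A^{-1}$ belongs to
$H$ for all real $t$. Thus, $AYA^{-1}$ is, again, in $\mathfrak{h}$. Then, as in the proof of Point
3 of Theorem 2.18, if $X$ and $Y$ are in $\mathfrak{h}$ we have $e^{tX}Ye^{-tX}$ in $\mathfrak{h}$ for all $t$.
Therefore, since $\mathfrak{h}$ is a vector space and (thus) a topologically closed subset
of $M_n(\mathbb{C})$, we have

$$[X,Y] = \left.\frac{d}{dt}e^{tX}Ye^{-tX}\right|_{t=0} = \lim_{s \to 0}\frac{e^{sX}Ye^{-sX} - Y}{s} \in \mathfrak{h}.$$

□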


We are now ready to state the main result of this section, which is our 
second major application of the Baker-Campbell-Hausdorff formula. 
Theorem 3.15. Let $G$ be a matrix Lie group with Lie algebra $\mathfrak{g}$. Let $\mathfrak{h}$ be a
Lie subalgebra of $\mathfrak{g}$. Then, there exists a unique connected Lie subgroup $H$ of
$G$ such that the Lie algebra of $H$ is $\mathfrak{h}$. The subgroup $H$ consists precisely of
elements of the form


$$e^{X_1}e^{X_2}\cdots e^{X_m}$$


with $X_1,\ldots,X_m \in \mathfrak{h}$.


The proof of this result is given at the end of this section. 

Given a matrix Lie group G and a subalgebra h of g, the associated con- 
nected Lie subgroup H might be a matrix Lie group. This will happen precisely 
if H is a closed subset of G. There are various conditions under which it can 
be proved that $H$ is closed. For example, if $G = \mathrm{GL}(n;\mathbb{C})$ and $\mathfrak{h}$ is semisimple
(Chapter 6), then H is automatically closed, and hence a matrix Lie group. 
(See Helgason (1978), Chapter II, Exercises and Further Results, D.) 

If the Baker-Campbell-Hausdorff formula worked globally instead of only
locally, the proof of this theorem would be easy. If the Baker-Campbell-Hausdorff
formula converged for all $X$ and $Y$, we could just define $H$ to be the
image of $\mathfrak{h}$ under the exponential mapping. In that case, the Baker-Campbell-Hausdorff
formula would show that this image is a subgroup, since, then, we
would have $e^{H_1}e^{H_2} = e^Z$, with $Z = H_1 + H_2 + \frac{1}{2}[H_1,H_2] + \cdots \in \mathfrak{h}$, provided
that $H_1, H_2 \in \mathfrak{h}$ and that $\mathfrak{h}$ is a subalgebra. Unfortunately, the Baker-Campbell-Hausdorff
formula is not convergent in general, and, in general, the
image of $\mathfrak{h}$ under the exponential mapping is not a subgroup.
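
For small matrices the local statement is easy to test. The Python sketch below compares log(exp(X)exp(Y)) with the truncations of the series after the second-order and third-order brackets. The power-series implementations of exp and log and the sample matrices are illustrative assumptions, and the logarithm series is valid only near the identity. The third-order terms used are the standard ones, (1/12)[X,[X,Y]] − (1/12)[Y,[X,Y]].

```python
def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def add(a, b, c=1.0):
    """Entrywise a + c*b."""
    n = len(a)
    return [[a[i][j] + c * b[i][j] for j in range(n)] for i in range(n)]

def expm(x, terms=30):
    """Matrix exponential via its power series (adequate for small matrices)."""
    result = identity(len(x))
    term = identity(len(x))
    for k in range(1, terms):
        term = [[v / k for v in row] for row in matmul(term, x)]
        result = add(result, term)
    return result

def logm_near_identity(a, terms=60):
    """log(A) = B - B^2/2 + B^3/3 - ... with B = A - I, for A near I."""
    b = add(a, identity(len(a)), -1.0)
    result = [[0.0] * len(a) for _ in a]
    power = identity(len(a))
    for k in range(1, terms):
        power = matmul(power, b)
        result = add(result, power, (-1.0) ** (k + 1) / k)
    return result

def bracket(x, y):
    return add(matmul(x, y), matmul(y, x), -1.0)

def max_abs(a):
    return max(abs(v) for row in a for v in row)

X = [[0.0, 0.1], [0.0, 0.0]]
Y = [[0.0, 0.0], [0.1, 0.0]]
Z = logm_near_identity(matmul(expm(X), expm(Y)))

XY = bracket(X, Y)
second = add(add(X, Y), XY, 0.5)
third = add(add(second, bracket(X, XY), 1 / 12), bracket(Y, XY), -1 / 12)

err2 = max_abs(add(Z, second, -1.0))
err3 = max_abs(add(Z, third, -1.0))
# Each additional order of brackets improves the approximation.
assert err3 < err2
```

For larger X and Y the logarithm series itself may fail to converge, which matches the warning in the text that the formula is only local.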


Proposition 3.16. Suppose that $G$ is a matrix Lie group with Lie algebra $\mathfrak{g}$
and suppose that $\mathfrak{h}$ is a subalgebra of $\mathfrak{g}$. Suppose that $F$ is a connected matrix
Lie group with Lie algebra $\mathfrak{f}$ and that $\Phi : F \to G$ is a Lie group homomorphism
with the property that $\phi(\mathfrak{f}) = \mathfrak{h}$. (Here, $\phi$ is the Lie algebra homomorphism
associated to $\Phi$.) Then, the connected Lie subgroup of $G$ with Lie algebra $\mathfrak{h}$ is
equal to $\Phi(F)$ (the image of $F$ under $\Phi$).


Proof. Let $H$ be the connected Lie subgroup of $G$ with Lie algebra $\mathfrak{h}$. Since $F$
is connected, every element $A$ of $F$ can be written as $A = \exp X_1 \cdots \exp X_m$,
$X_k \in \mathfrak{f}$. So, every element of $\Phi(F)$ can be written as $\exp\phi(X_1)\cdots\exp\phi(X_m)$,
where, by assumption, $\phi(X_k) \in \mathfrak{h}$. This shows that $\Phi(F) \subset H$. Conversely,
every element $B$ of $H$ can be written as $B = \exp Y_1 \cdots \exp Y_m$,
with $Y_k \in \mathfrak{h} = \phi(\mathfrak{f})$. Choosing $X_k$'s in $\mathfrak{f}$ with $\phi(X_k) = Y_k$, we have that
$B = \Phi(\exp X_1)\cdots\Phi(\exp X_m)$. This shows that $H \subset \Phi(F)$. □


If H is a connected Lie subgroup of a matrix Lie group G, then the topology 
that H inherits as a subset of G may be quite pathological (e.g., not locally 
connected). However, we can define a different topology on H that is much 
nicer. For any $A \in H$ and any $\varepsilon > 0$, define


$$U_{A,\varepsilon} = \{Ae^X \mid X \in \mathfrak{h} \text{ and } \|X\| < \varepsilon\}.$$

Now define a topology on $H$ as follows: A set $U \subset H$ is open if for each $A \in U$
there exists $\varepsilon > 0$ such that $U_{A,\varepsilon} \subset U$. In this topology, two elements $A$ and
$B$ of $H$ are “close” if we can express $B$ as $B = A\exp X$ with $X \in \mathfrak{h}$ and $\|X\|$
small. This topology is finer than the topology $H$ inherits from $G$; that is, if
$A, B \in H$ are close in this new topology on $H$, then they are close in the usual
sense in $G$, but not vice versa.

If H is a connected Lie subgroup of G, then it can be shown that in this 
new topology, H is a topological manifold. Furthermore, H can be made into a 
smooth manifold by using the sets $U_{A,\varepsilon}$ as our basic coordinate neighborhoods
and using the quantity X in the expression A exp X as our local coordinate. 
The product and inverse maps on H are smooth with respect to this smooth 
manifold structure, and so H can in this way be made into a Lie group. 

We summarize these conclusions in the following theorem. It is not hard 
to prove this result by elaborating on the discussion in the previous two para- 
graphs. Compare the section “Lie Subgroups” in Chapter 3 of Warner (1983). 


Theorem 3.17. Suppose that G is a matrix Lie group and H a connected Lie 
subgroup of G. Then H can be given the structure of a Lie group in such a 
way that the inclusion of H into G is a Lie group homomorphism. 


Once H has been made into a Lie group, it has a Lie algebra in the sense of 
Appendix C. This Lie algebra is naturally isomorphic to the subalgebra h we 
began with. Thus, Theorem 3.17 and Ado’s Theorem (Theorem 2.40, which 
we have not proved) imply the following result. 


Theorem 3.18. Every finite-dimensional real Lie algebra is isomorphic to the 
Lie algebra of some Lie group. 


We now turn to the proof of Theorem 3.15. 


Proof. Since $G$ is assumed to be a matrix Lie group, we may as well assume
that $G = \mathrm{GL}(n;\mathbb{C})$ so that $\mathfrak{g} = \mathrm{gl}(n;\mathbb{C})$. (After all, if $G$ is a closed subgroup
of $\mathrm{GL}(n;\mathbb{C})$ and $H$ is a connected Lie subgroup of $\mathrm{GL}(n;\mathbb{C})$ whose Lie algebra
$\mathfrak{h}$ is contained in $\mathfrak{g}$, then $H$ is also a connected Lie subgroup of $G$.) As in
the proof of Theorem 2.27, we think of $\mathrm{gl}(n;\mathbb{C})$ as $\mathbb{R}^{2n^2}$ and we decompose
$\mathrm{gl}(n;\mathbb{C})$ as the direct sum of $\mathfrak{h}$ and $D$, where $D$ is the orthogonal complement
of $\mathfrak{h}$ with respect to the usual inner product on $\mathbb{R}^{2n^2}$. Then, as shown in the
proof of Theorem 2.27, there exist neighborhoods $U$ and $V$ of the origin in $\mathfrak{h}$
and $D$, respectively, and a neighborhood $W$ of $I$ in $\mathrm{GL}(n;\mathbb{C})$ such that each
$A \in W$ can be written uniquely as

$$A = e^X e^Y \qquad (3.23)$$


with $X \in U$, $Y \in V$, and such that $X$ and $Y$ depend continuously on $A$. Now,
define

$$E = \{Y \in V \mid e^Y \in H\}.$$


Lemma 3.19. The set $E$ is at most countable.

Let us assume this result for the moment and continue with the proof of
the theorem. Define $H$ to be the set of elements $A \in \mathrm{GL}(n;\mathbb{C})$ that can be
expressed in the form $\exp X_1 \cdots \exp X_m$ for some finite collection $X_1,\ldots,X_m$
of elements of $\mathfrak{h}$. This set is, by definition, closed under multiplication. It
is closed under inverses since the inverse of $\exp X$ is $\exp(-X)$. So, $H$ is a
subgroup of $\mathrm{GL}(n;\mathbb{C})$. Furthermore, $H$ satisfies, by its definition, Condition 3
in the definition of connected Lie subgroups. Thus, it remains only to show
that the Lie algebra of $H$ is $\mathfrak{h}$.

Let $\mathfrak{h}'$ be the Lie algebra of $H$. Clearly, $\mathfrak{h}' \supset \mathfrak{h}$, so it remains to show that
$\mathfrak{h}' \subset \mathfrak{h}$. Suppose $Z$ is an element of $\mathfrak{h}'$. Then, as in (3.23), we may write, for
all sufficiently small $t$,

$$e^{tZ} = e^{X(t)}e^{Y(t)},$$

where $X(t) \in U \subset \mathfrak{h}$ and $Y(t) \in V \subset D$ and where $X(t)$ and $Y(t)$ are
continuous functions of $t$. Now, both $\exp(tZ)$ and $\exp X(t)$ belong to $H$, and
since $H$ is a subgroup, we conclude that $\exp Y(t)$ must also belong to $H$. This
means that $Y(t)$ belongs to the set $E$ for all sufficiently small $t$. If $Y(t)$ were
not constant, then it would take on uncountably many values, which would
mean that $E$ is uncountable, violating Lemma 3.19. So, $Y(t)$ must be constant,
and since $Y(0) = 0$, this means that $Y(t)$ is identically equal to zero. Thus,
for small $t$, we have $\exp(tZ) = \exp X(t)$ and, therefore, $tZ = X(t) \in \mathfrak{h}$. This
means $Z \in \mathfrak{h}$ and we conclude that $\mathfrak{h}' \subset \mathfrak{h}$.

So, it remains only to prove Lemma 3.19. (This proof is the only place we
use that $\mathfrak{h}$ is a subalgebra of $\mathrm{gl}(n;\mathbb{C})$ and not just a subspace.) Before doing
this, we prove another lemma.


Lemma 3.20. Pick a basis for $\mathfrak{h}$ and call an element $R$ of $\mathfrak{h}$ rational if its
coefficients with respect to this basis are rational. Then, for every $\delta > 0$ and
every $A \in H$, there exist rational elements $R_1,\ldots,R_m$ of $\mathfrak{h}$ such that


$$A = e^{R_1}e^{R_2}\cdots e^{R_m}e^X,$$

where $X$ is in $\mathfrak{h}$ and $\|X\| < \delta$.


Proof. Choose $\varepsilon > 0$ small enough that the Baker-Campbell-Hausdorff formula
applies for all $X$ and $Y$ with $\|X\| < \varepsilon$ and $\|Y\| < \varepsilon$. Let $C(X,Y)$ denote
the quantity on the right-hand side of the Baker-Campbell-Hausdorff formula, so
that $C(X,Y)$ satisfies

$$e^X e^Y = e^{C(X,Y)}$$

whenever $\|X\|, \|Y\| < \varepsilon$. It is not hard to see that the function $C(X,Y)$ is
continuous.

Now, choose $\varepsilon' > 0$ small enough that $\|C(X,Y)\| < \varepsilon$ for all $X$ and $Y$
with $\|X\| < \varepsilon'$ and $\|Y\| < \varepsilon'$. Since $\exp X = (\exp(X/n))^n$, every element $A$
of $H$ can be written in the form


$$A = e^{X_1}\cdots e^{X_m} \qquad (3.24)$$

for some sequence $X_1,\ldots,X_m$ in $\mathfrak{h}$ with $\|X_k\| < \varepsilon'$, $k = 1,\ldots,m$.

Now, because $\mathfrak{h}$ is a subalgebra of $\mathrm{gl}(n;\mathbb{C})$, $C(X_1,X_2)$ will be, again, an
element of $\mathfrak{h}$, since the operators $\mathrm{ad}_{X_1}$ and $\mathrm{ad}_{X_2}$ on the right-hand side of the
Baker-Campbell-Hausdorff formula preserve $\mathfrak{h}$. Choose a rational element $R_1$
of $\mathfrak{h}$ that is very close to $C(X_1,X_2)$ and that satisfies $\|R_1\| < \varepsilon$. (This is
possible because $X_1$ and $X_2$ have norm less than $\varepsilon'$ and, thus, $C(X_1,X_2)$ has
norm less than $\varepsilon$.) Then we have

$$e^{X_1}e^{X_2} = e^{C(X_1,X_2)} = e^{R_1}e^{-R_1}e^{C(X_1,X_2)} = e^{R_1}e^{\tilde{X}_2},$$

where $\tilde{X}_2 = C(-R_1, C(X_1,X_2))$. Now, $C(\cdot,\cdot)$ is continuous and

$$C(-C(X_1,X_2), C(X_1,X_2)) = -C(X_1,X_2) + C(X_1,X_2) = 0,$$

since $C(X_1,X_2)$ commutes with itself. Thus, if we choose $R_1$ sufficiently close
to $C(X_1,X_2)$, we will have $\|\tilde{X}_2\| < \varepsilon'$.

We see, then, that (3.24) may be rewritten as

$$A = e^{R_1}e^{\tilde{X}_2}e^{X_3}\cdots e^{X_m},$$

where $R_1$ is rational and $\tilde{X}_2$ (like $X_2$) has norm less than $\varepsilon'$. Applying the
same argument to $\tilde{X}_2$ and $X_3$ we obtain

$$A = e^{R_1}e^{R_2}e^{\tilde{X}_3}e^{X_4}\cdots e^{X_m}.$$

Continuing on in the same way we eventually obtain

$$A = e^{R_1}e^{R_2}\cdots e^{R_{m-1}}e^{\tilde{X}_m}$$

with $R_1,\ldots,R_{m-1}$ rational. If, at the very last stage, we choose $R_{m-1}$ so that
$\|\tilde{X}_m\| < \delta$, we have expressed $A$ in the desired form. □


We now supply the proof of Lemma 3.19. 


Proof. Fix $\delta$ so that for all $X$ and $Y$ with $\|X\|, \|Y\| < \delta$ the quantity $C(X,Y)$
(the right-hand side of the Baker-Campbell-Hausdorff formula) is well defined
and contained in $U$. Then, I claim that for each sequence $R_1,\ldots,R_m$ of rational
elements in $\mathfrak{h}$, there is at most one $X \in \mathfrak{h}$ with $\|X\| < \delta$ such that the
element

$$e^{R_1}e^{R_2}\cdots e^{R_m}e^X \qquad (3.25)$$

belongs to $\exp V$. After all, if we have

$$e^{R_1}e^{R_2}\cdots e^{R_m}e^{X_1} = e^{Y_1}, \qquad (3.26)$$

$$e^{R_1}e^{R_2}\cdots e^{R_m}e^{X_2} = e^{Y_2} \qquad (3.27)$$

with $Y_1, Y_2 \in V$, then

$$e^{Y_2} = e^{Y_1}e^{-X_1}e^{X_2} = e^{Y_1}e^{C(-X_1,X_2)}$$

with $C(-X_1,X_2) \in U$. However, each element of $\exp V \exp U$ has a unique
representation as $e^Y e^X$ with $X \in U$ and $Y \in V$, so we must have $Y_2 = Y_1$
and so (by (3.26) and (3.27)) $e^{X_1} = e^{X_2}$, which implies that $X_1 = X_2$, since
exp is injective on $U$.

By Lemma 3.20, every element of $H$ can be expressed in the form (3.25)
with $\|X\| < \delta$. Now, there are only countably many rational elements in $\mathfrak{h}$ and
thus only countably many expressions of the form $e^{R_1}\cdots e^{R_m}$, each of which
produces at most one element of the form (3.25) that belongs to $\exp V$. Thus,
the set $E$ in Lemma 3.19 is at most countable. □


This completes the proof of Theorem 3.15. □


3.9 Exercises 


1. The center of a Lie algebra $\mathfrak{g}$ is defined to be the set of all $X \in \mathfrak{g}$ such
that $[X,Y] = 0$ for all $Y \in \mathfrak{g}$. Now, consider the Heisenberg group

$$H = \left\{ \begin{pmatrix} 1 & a & b \\ 0 & 1 & c \\ 0 & 0 & 1 \end{pmatrix} \,\middle|\, a, b, c \in \mathbb{R} \right\}$$

with Lie algebra

$$\mathfrak{h} = \left\{ \begin{pmatrix} 0 & a & b \\ 0 & 0 & c \\ 0 & 0 & 0 \end{pmatrix} \,\middle|\, a, b, c \in \mathbb{R} \right\}.$$

Determine the center $Z(\mathfrak{h})$ of $\mathfrak{h}$. For any $X, Y \in \mathfrak{h}$, show that $[X,Y] \in Z(\mathfrak{h})$.
Note that this implies, in particular, that both $X$ and $Y$ commute
with their commutator.
Show by direct computation that for any $X, Y \in \mathfrak{h}$,

$$e^X e^Y = e^{X+Y+\frac{1}{2}[X,Y]}.$$

2. Let $X$ be a linear transformation on a finite-dimensional real or complex
vector space. Show that

$$\frac{I - e^{-X}}{X}$$

is invertible if and only if none of the eigenvalues of $X$ (over $\mathbb{C}$) is of the
form $2\pi i n$, with $n$ a nonzero integer.

Remark: This exercise, combined with the formula in Theorem 3.5, gives
the following result (in the language of differentiable manifolds): The
exponential mapping $\exp : \mathfrak{g} \to G$ is a local diffeomorphism near $X \in \mathfrak{g}$
if and only if $\mathrm{ad}_X : \mathfrak{g} \to \mathfrak{g}$ has no eigenvalue of the form $2\pi i n$, with $n$ a
nonzero integer.


3. Show that for any $n \times n$ matrices $X$ and $Y$,

$$\left\| \left.\frac{d}{dt}(X+tY)^m\right|_{t=0} \right\| \leq m\|X\|^{m-1}\|Y\|.$$

Using this, show that the map $\exp : M_n(\mathbb{C}) \to M_n(\mathbb{C})$ is continuously
differentiable.

Hint: Since we know that the series for the exponential mapping converges
uniformly on sets of the form $\{X \mid \|X\| < R\}$, it suffices to show that the
series of term-by-term directional derivatives also converges uniformly on
such sets. (Compare Theorem 7.17 in Rudin (1976).)

4. Show that for any $X$ and $Y$ in $M_n(\mathbb{C})$, even if $X$ and $Y$ do not commute,

$$\left.\frac{d}{dt}\operatorname{trace}\left(e^{X+tY}\right)\right|_{t=0} = \operatorname{trace}\left(e^X Y\right).$$

5. Verify that the right-hand side of the Baker-Campbell-Hausdorff formula
(3.6) reduces to $X + Y$ in the case that $X$ and $Y$ commute.

6. Compute $\log(e^X e^Y)$ through third order in $X$ and $Y$ by using the power
series for the exponential and the logarithm. Show this gives the same
answer as the Baker-Campbell-Hausdorff formula.

7. Using the techniques in Section 3.5, compute the series form of the
Baker-Campbell-Hausdorff formula up through fourth-order brackets. (We have
already computed up through third-order brackets.)

8. Suppose that $X$ and $Y$ are upper triangular matrices with zeros on the
diagonal. Show that the power series for $\log(\exp X \exp Y)$ is convergent.
What happens to the series form of the Baker-Campbell-Hausdorff formula
in this case?

9. Give an example of matrices $X$ and $Y$ in $\mathrm{sl}(2;\mathbb{R})$ such that there does
not exist any $Z$ in $\mathrm{sl}(2;\mathbb{R})$ with $\exp X \exp Y = \exp Z$. Use Exercise 30 of
Chapter 2. What does this say about the result of applying the
Baker-Campbell-Hausdorff formula to $X$ and $Y$?

10. Complete Step 5 in the proof of Theorem 3.7 by showing that $\Phi$ as defined
in Steps 1 through 4 is a homomorphism. Given $A, B \in G$, choose a
path $A(t)$ connecting $I$ to $A$ and a path $B(t)$ connecting $I$ to $B$. Then,
define a path $C$ by setting $C(t) = A(2t)$ for $0 \leq t \leq 1/2$ and setting
$C(t) = A \cdot B(2t-1)$ for $1/2 \leq t \leq 1$. (Thus, $C$ connects $I$ to $AB$.) If
$t_0,\ldots,t_m$ is a valid partition for $A(t)$ and $s_0,\ldots,s_M$ is a valid partition
for $B(t)$, show that

$$\frac{t_0}{2},\ldots,\frac{t_m}{2},\ \frac{1}{2}+\frac{s_0}{2},\ldots,\frac{1}{2}+\frac{s_M}{2}$$

is a valid partition for $C(t)$. Now, compute $\Phi(A)$, $\Phi(B)$, and $\Phi(AB)$ using
these paths and partitions and show that $\Phi(AB) = \Phi(A)\Phi(B)$.

11. If $\tilde{G}$ is a universal cover of a connected group $G$ with projection map $\Phi$,
show that $\Phi$ maps $\tilde{G}$ onto $G$.

12. Suppose that $G$ is a connected matrix Lie group and that $\tilde{G}$ is the universal
cover of $G$. Show that $G$ is isomorphic to $\tilde{G}/N$, where $N$ is a discrete
subgroup of the center of $\tilde{G}$. Use Exercise 11 from Chapter 1.

13. Suppose that $G$ is a matrix Lie group with Lie algebra $\mathfrak{g}$. Suppose that
$\phi : \mathrm{sl}(n;\mathbb{R}) \to \mathfrak{g}$ is a Lie algebra homomorphism. Show that there exists
a Lie group homomorphism $\Phi : \mathrm{SL}(n;\mathbb{R}) \to G$ such that $\Phi(\exp X) =
\exp\phi(X)$ for all $X \in \mathrm{sl}(n;\mathbb{R})$. This is true even though $\mathrm{SL}(n;\mathbb{R})$ is not
simply connected.

Hint: Use the fact that $\mathrm{SL}(n;\mathbb{C})$ is simply connected.

Note: The result of this problem is false if $G$ is assumed merely to be a
Lie group and not a matrix Lie group.

14. Prove the uniqueness portion of Theorem 3.10. Use the fact that
Theorem 3.7 (and basic results from Chapter 2) continue to hold for all (not
necessarily matrix) Lie groups.

15. Show that the Lie algebra homomorphism associated to the group
homomorphism $\Phi$ in (3.20) is a Lie algebra isomorphism. (Here the Lie algebra
of $\mathbb{R}$ is identified simply with $\mathbb{R}$.)

16. Let $\mathfrak{a}$ be a subalgebra of the Lie algebra of the Heisenberg group. Show
that $\exp(\mathfrak{a})$ is a connected Lie subgroup of the Heisenberg group.

17. Show that every connected Lie subgroup of $\mathrm{SU}(2)$ is closed. Show that
this is not the case for $\mathrm{SU}(3)$.

18. Let $G$ be a matrix Lie group with Lie algebra $\mathfrak{g}$, let $\mathfrak{h}$ be a subalgebra of $\mathfrak{g}$,
and let $H$ be the unique connected Lie subgroup of $G$ with Lie algebra $\mathfrak{h}$.
Suppose that there exists a compact simply-connected matrix Lie group
$K$ such that the Lie algebra of $K$ is isomorphic to $\mathfrak{h}$. Show that $H$ is
closed. Is $H$ necessarily isomorphic to $K$?


4


Basic Representation Theory 


4.1 Representations 


Definition 4.1. Let $G$ be a matrix Lie group. Then, a finite-dimensional
complex representation of $G$ is a Lie group homomorphism


$$\Pi : G \to \mathrm{GL}(n;\mathbb{C})$$

($n \geq 1$) or, more generally, a Lie group homomorphism

$$\Pi : G \to \mathrm{GL}(V),$$


where $V$ is a finite-dimensional complex vector space (with $\dim(V) \geq 1$). A
finite-dimensional real representation of $G$ is a Lie group homomorphism
$\Pi$ of $G$ into $\mathrm{GL}(n;\mathbb{R})$ or into $\mathrm{GL}(V)$, where $V$ is a finite-dimensional
real vector space.

If $\mathfrak{g}$ is a real or complex Lie algebra, then a finite-dimensional complex
representation of $\mathfrak{g}$ is a Lie algebra homomorphism $\pi$ of $\mathfrak{g}$ into $\mathrm{gl}(n;\mathbb{C})$ or
into $\mathrm{gl}(V)$, where $V$ is a finite-dimensional complex vector space. If $\mathfrak{g}$ is a real
Lie algebra, then a finite-dimensional real representation of $\mathfrak{g}$ is a Lie
algebra homomorphism $\pi$ of $\mathfrak{g}$ into $\mathrm{gl}(n;\mathbb{R})$ or into $\mathrm{gl}(V)$.

If $\Pi$ or $\pi$ is a one-to-one homomorphism, then the representation is called
faithful.


One should think of a representation as a linear action of a group or 
Lie algebra on a vector space (since, say, to every $g \in G$, there is associated
an operator $\Pi(g)$, which acts on the vector space $V$). In fact, we will use
terminology such as “Let $\Pi$ be a representation of $G$ acting on the space $V$.”
Even if g is a real Lie algebra, we will consider mainly complex representations 
of g. After making a few more definitions, we will discuss the question of why 
one should be interested in studying representations. 


Definition 4.2. Let $\Pi$ be a finite-dimensional real or complex representation
of a matrix Lie group $G$, acting on a space $V$. A subspace $W$ of $V$ is called
invariant if $\Pi(A)w \in W$ for all $w \in W$ and all $A \in G$. An invariant subspace
$W$ is called nontrivial if $W \neq \{0\}$ and $W \neq V$. A representation with no
nontrivial invariant subspaces is called irreducible.

The terms invariant, nontrivial, and irreducible are defined analogously
for representations of Lie algebras.


Definition 4.3. Let $G$ be a matrix Lie group, let $\Pi$ be a representation of
$G$ acting on the space $V$, and let $\Sigma$ be a representation of $G$ acting on the
space $W$. A linear map $\phi : V \to W$ is called an intertwining map of
representations if

$$\phi(\Pi(A)v) = \Sigma(A)\phi(v)$$

for all $A \in G$ and all $v \in V$. The analogous property defines intertwining
maps of representations of a Lie algebra.

If $\phi$ is an intertwining map of representations and, in addition, $\phi$ is invertible,
then $\phi$ is said to be an equivalence of representations. If there exists
an equivalence (that is, an invertible intertwining map) between $V$ and $W$,
then the representations are said to be equivalent.


Two equivalent representations should be regarded as being “the same” 
representation. A typical problem in representation theory is to determine, 
up to equivalence, all of the irreducible representations of a particular group 
or Lie algebra. In Section 4.4, we will determine all the finite-dimensional 
complex irreducible representations of the Lie algebra su(2). 


Proposition 4.4. Let $G$ be a matrix Lie group with Lie algebra $\mathfrak{g}$ and let $\Pi$
be a (finite-dimensional real or complex) representation of $G$, acting on the
space $V$. Then, there is a unique representation $\pi$ of $\mathfrak{g}$ acting on the same
space such that

$$\Pi(e^X) = e^{\pi(X)}$$


for all $X \in \mathfrak{g}$. The representation $\pi$ can be computed as


$$\pi(X) = \left.\frac{d}{dt}\Pi(e^{tX})\right|_{t=0}$$


and satisfies

$$\pi(AXA^{-1}) = \Pi(A)\pi(X)\Pi(A)^{-1}$$


for all $X \in \mathfrak{g}$ and all $A \in G$.


Proof. Theorem 2.21 states that for each Lie group homomorphism $\Phi : G \to H$,
there is an associated Lie algebra homomorphism $\phi : \mathfrak{g} \to \mathfrak{h}$. Take
$H = GL(V)$ and $\Phi = \Pi$. Since the Lie algebra of $GL(V)$ is $\mathfrak{gl}(V)$ (since the
exponential of any operator is invertible), the associated Lie algebra
homomorphism $\phi = \pi$ maps from $\mathfrak{g}$ to $\mathfrak{gl}(V)$ and, so, constitutes a
representation of $\mathfrak{g}$.
The properties of $\pi$ follow from the properties of $\phi$ given in Theorem 2.21.
□

Proposition 4.5.

1. Let $G$ be a connected matrix Lie group with Lie algebra $\mathfrak{g}$. Let $\Pi$ be a
representation of $G$ and $\pi$ the associated representation of $\mathfrak{g}$. Then, $\Pi$ is
irreducible if and only if $\pi$ is irreducible.

2. Let $G$ be a connected matrix Lie group, let $\Pi_1$ and $\Pi_2$ be representations of
$G$, and let $\pi_1$ and $\pi_2$ be the associated Lie algebra representations. Then,
$\pi_1$ and $\pi_2$ are equivalent if and only if $\Pi_1$ and $\Pi_2$ are equivalent.


Proof. For Point 1, suppose first that $\Pi$ is irreducible. We then want to show
that $\pi$ is irreducible. So, let $W$ be a subspace of $V$ that is invariant under
$\pi(X)$ for all $X \in \mathfrak{g}$. We want to show that $W$ is either $\{0\}$ or $V$. Now,
suppose $A$ is an element of $G$. Since $G$ is assumed connected, Corollary 2.31
tells us that $A$ can be written as $A = e^{X_1} \cdots e^{X_m}$ for some $X_1, \ldots, X_m$ in $\mathfrak{g}$.
Since $W$ is invariant under $\pi(X_i)$, it will also be invariant under
$\exp(\pi(X_i)) = I + \pi(X_i) + \pi(X_i)^2/2 + \cdots$ and, hence, under

$$\Pi(A) = \Pi(e^{X_1} \cdots e^{X_m}) = \Pi(e^{X_1}) \cdots \Pi(e^{X_m}) = e^{\pi(X_1)} \cdots e^{\pi(X_m)}.$$

Since $\Pi$ is irreducible and $W$ is invariant under each $\Pi(A)$, $W$ must be either
$\{0\}$ or $V$. This shows that $\pi$ is irreducible.

In the other direction, assume that $\pi$ is irreducible and that $W$ is an
invariant subspace for $\Pi$. Then, $W$ is invariant under $\Pi(\exp tX)$ for all
$X \in \mathfrak{g}$ and all real $t$ and, hence, under

$$\pi(X) = \left.\frac{d}{dt}\,\Pi(e^{tX})\right|_{t=0}.$$

(The difference quotients defining this derivative map $W$ into $W$, and $W$, being
finite dimensional, is closed, so the limit also maps $W$ into $W$.) Thus, since
$\pi$ is irreducible, $W$ is $\{0\}$ or $V$, and we conclude that $\Pi$ is irreducible.
This establishes Point 1 of the proposition.

Point 2 of the proposition is similar and is left as an exercise to the reader
(Exercise 1). □


Proposition 4.6. Let $\mathfrak{g}$ be a real Lie algebra and $\mathfrak{g}_{\mathbb{C}}$ its complexification.
Then, every finite-dimensional complex representation $\pi$ of $\mathfrak{g}$ has a unique
extension to a complex-linear representation of $\mathfrak{g}_{\mathbb{C}}$, also denoted $\pi$ and given
by

$$\pi(X + iY) = \pi(X) + i\pi(Y)$$

for all $X, Y \in \mathfrak{g}$. Furthermore, $\pi$ is irreducible as a representation of $\mathfrak{g}_{\mathbb{C}}$ if
and only if it is irreducible as a representation of $\mathfrak{g}$.


Proof. The existence and uniqueness of the extension are trivial and follow
from Exercise 23 of Chapter 2.

Concerning irreducibility, let us make sure that we are clear about what
the statement means. Suppose that $\pi$ is a complex representation of the real
Lie algebra $\mathfrak{g}$, acting on the complex vector space $V$. Then, saying that $\pi$
is irreducible means that there is no nontrivial invariant complex subspace
$W \subset V$. That is, even though $\mathfrak{g}$ is a real Lie algebra, when considering complex
representations of $\mathfrak{g}$, we are interested only in complex invariant subspaces.

Now, suppose that $\pi$ is irreducible as a representation of $\mathfrak{g}$. If $W$ is a
complex subspace of $V$ which is invariant under $\mathfrak{g}_{\mathbb{C}}$, then, certainly, $W$ is
invariant under $\mathfrak{g} \subset \mathfrak{g}_{\mathbb{C}}$. Therefore, $W = \{0\}$ or $W = V$. Thus, $\pi$ is irreducible
as a representation of $\mathfrak{g}_{\mathbb{C}}$.

On the other hand, suppose that $\pi$ is irreducible as a representation of
$\mathfrak{g}_{\mathbb{C}}$ and suppose that $W$ is a complex subspace of $V$ which is invariant under
$\mathfrak{g}$. Then, $W$ will also be invariant under $\pi(X + iY) = \pi(X) + i\pi(Y)$, for all
$X, Y \in \mathfrak{g}$. Since every element of $\mathfrak{g}_{\mathbb{C}}$ can be written as $X + iY$, we conclude
that, in fact, $W$ is invariant under $\mathfrak{g}_{\mathbb{C}}$. Thus, $W = \{0\}$ or $W = V$ and $\pi$ is
irreducible as a representation of $\mathfrak{g}$. □


Definition 4.7. Let $G$ be a matrix Lie group, let $H$ be a Hilbert space, and
let $U(H)$ denote the group of unitary operators on $H$. Then, a homomorphism
$\Pi : G \to U(H)$ is called a unitary representation of $G$ if $\Pi$ satisfies the
following continuity condition: If $A_n, A \in G$ and $A_n \to A$, then

$$\Pi(A_n)v \to \Pi(A)v$$

for all $v \in H$. A unitary representation with no nontrivial closed invariant
subspaces is called irreducible.


This continuity condition is called strong continuity. One could require
the even stronger condition that $\|\Pi(A_n) - \Pi(A)\| \to 0$, but this turns out to
be too stringent a requirement. (That is, most of the interesting unitary
representations of $G$ will not have this stronger continuity condition.) In practice,
any homomorphism of $G$ into $U(H)$ that one can write down explicitly will
be strongly continuous.

Note here that H is not assumed to be finite dimensional. Although we will 
deal in this book almost exclusively with finite-dimensional representations, 
it is good to be aware of the concept of infinite-dimensional unitary represen- 
tations. If H is infinite dimensional, there are many technical issues that we 
will not be able to delve into in this book. For example, the correct notion 
of a Lie algebra representation associated to an infinite-dimensional unitary 
representation is quite subtle and we will not address this issue at all. Nev- 
ertheless, see Exercise 8 for a calculation of such a Lie algebra representation 
(in which all technical difficulties are swept under the carpet). 


4.2 Why Study Representations? 


If a representation $\Pi$ is a faithful representation of a matrix Lie group $G$,
then $\{\Pi(A) \mid A \in G\}$ is a group of matrices that is isomorphic to the original
group $G$. Thus, $\Pi$ allows us to represent $G$ as a group of matrices. This is
the motivation for the term “representation.” (Of course, we still call $\Pi$ a
representation even if it is not faithful.)

Despite the origin of the term, the point of representation theory is not (at 
least in this book) to represent a group as a group of matrices. After all, all 
of our groups are already matrix groups! Although it might seem redundant 
to study representations of a group which is already represented as a group 
of matrices, this is precisely what we are going to do. 

The reason for this is that a representation can be thought of (as we have 
already noted) as an action of our group on some vector space. Such actions 
(representations) arise naturally in many branches of both mathematics and 
physics, and it is important to understand them. 

A typical example would be a differential equation in three-dimensional 
space which has rotational symmetry. If the equation has rotational symmetry, 
then the space of solutions will be invariant under rotations. Thus, the space 
of solutions will constitute a representation of the rotation group SO(3). If one 
knows what all of the representations of SO(3) are, this can help immensely 
in narrowing down what the space of solutions can be. (As we will see, SO(3) 
has many other representations besides the obvious one in which SO(3) acts 
on $\mathbb{R}^3$.)

In fact, one of the chief applications of representation theory is to exploit 
symmetry. If a system has symmetry, then the set of symmetries will form a 
group, and understanding the representations of the symmetry group allows 
one to use that symmetry to simplify the problem. 

In addition, studying the representations of a group G (or of a Lie algebra 
g) can give information about the group (or Lie algebra) itself. For example, 
if G is a finite group, then associated to G is something called the group 
algebra. The structure of this group algebra can be described very nicely in 
terms of the irreducible representations of G. 

In this book, we will be interested primarily in computing the finite- 
dimensional irreducible complex representations of matrix Lie groups. As we 
shall see, this problem can be reduced almost completely to the problem of 
computing the finite-dimensional irreducible complex representations of the 
associated Lie algebra. In this chapter, we will discuss the theory at an
elementary level and will consider in detail the examples of SO(3) and SU(2). In
Chapter 5, we will study the representations of SU(3), whose theory is similar to
but more involved than that of SU(2). In Chapter 7, we will look at the general
theory of representations of semisimple groups.


4.3 Examples of Representations 


4.3.1 The standard representation 


A matrix Lie group $G$ is, by definition, a subset of some $GL(n;\mathbb{C})$. The
inclusion map of $G$ into $GL(n;\mathbb{C})$ (i.e., $\Pi(A) = A$) is a representation of $G$,
called the standard representation of $G$. If $G$ happens to be contained in
$GL(n;\mathbb{R}) \subset GL(n;\mathbb{C})$, then we can think of the standard representation as a
real representation if we prefer. Thus, for example, the standard representation
of SO(3) is the one in which SO(3) acts in the usual way on $\mathbb{R}^3$ and the
standard representation of SU(2) is the one in which SU(2) acts on $\mathbb{C}^2$ in the
usual way. If $G$ is a subgroup of $GL(n;\mathbb{R})$ or $GL(n;\mathbb{C})$, then its Lie algebra $\mathfrak{g}$
will be a subalgebra of $\mathfrak{gl}(n;\mathbb{R})$ or $\mathfrak{gl}(n;\mathbb{C})$. The inclusion of $\mathfrak{g}$ into $\mathfrak{gl}(n;\mathbb{R})$ or
$\mathfrak{gl}(n;\mathbb{C})$ is a representation of $\mathfrak{g}$, called the standard representation.


4.3.2 The trivial representation 


Consider the one-dimensional complex vector space $\mathbb{C}$. Given any matrix Lie
group $G$, we can define the trivial representation of $G$, $\Pi : G \to GL(1;\mathbb{C})$,
by the formula

$$\Pi(A) = I$$

for all $A \in G$. Of course, this is an irreducible representation, since $\mathbb{C}$ has
no nontrivial subspaces, let alone nontrivial invariant subspaces. If $\mathfrak{g}$ is a Lie
algebra, we can also define the trivial representation of $\mathfrak{g}$, $\pi : \mathfrak{g} \to \mathfrak{gl}(1;\mathbb{C})$,
by

$$\pi(X) = 0$$

for all $X \in \mathfrak{g}$. This is an irreducible representation.


4.3.3 The adjoint representation 


Let $G$ be a matrix Lie group with Lie algebra $\mathfrak{g}$. We have already defined the
adjoint mapping

$$\mathrm{Ad} : G \to GL(\mathfrak{g})$$

by the formula

$$\mathrm{Ad}_A(X) = AXA^{-1}.$$

Recall that “Ad” is a Lie group homomorphism. Since Ad is a Lie group
homomorphism into a group of invertible operators, we see that, in fact, Ad
is a representation of $G$, acting on the space $\mathfrak{g}$. Thus, we can now give Ad its
proper name, the adjoint representation of $G$. The adjoint representation is
a real representation of $G$. (If $\mathfrak{g}$ happens to be a complex subspace of $M_n(\mathbb{C})$,
then we can think of the adjoint representation as a complex representation.)
Similarly, if $\mathfrak{g}$ is a Lie algebra, we have

$$\mathrm{ad} : \mathfrak{g} \to \mathfrak{gl}(\mathfrak{g}),$$

defined by the formula

$$\mathrm{ad}_X(Y) = [X, Y].$$

We know that “ad” is a Lie algebra homomorphism and is, therefore, a
representation of $\mathfrak{g}$, called the adjoint representation. In the case that $\mathfrak{g}$ is
the Lie algebra of some matrix Lie group $G$, we have already established
(Chapter 2, Proposition 2.24 and Exercise 19) that Ad and ad are related by

$$e^{\mathrm{ad}_X} = \mathrm{Ad}_{e^X}.$$

Note that in the case of SO(3), the standard representation and the adjoint
representation are both three-dimensional real representations. In fact, these
two representations are equivalent (Exercise 3).
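
As a concrete check of the relation between Ad and ad, and of the fact that ad is a Lie algebra homomorphism, the following Python sketch works in sl(2;C) with the basis H, X, Y used later in Section 4.4. The helper names (`bracket`, `coords`, `as_matrix`, `ad`, `Ad`) are illustrative choices and not notation from the text. The element X is chosen because it is nilpotent, so both exponential series terminate and the comparison is exact rather than approximate.

```python
from fractions import Fraction

def matmul(A, B):
    """Product of two matrices given as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def add(A, B, s=1):
    """Entrywise A + s*B."""
    return [[a + s * b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def bracket(A, B):
    """Commutator [A, B] = AB - BA."""
    return add(matmul(A, B), matmul(B, A), -1)

# Basis of sl(2;C).
H = [[1, 0], [0, -1]]
X = [[0, 1], [0, 0]]
Y = [[0, 0], [1, 0]]
BASIS = [H, X, Y]

def coords(Z):
    """Coordinates (h, x, y) of a traceless 2x2 matrix Z = h*H + x*X + y*Y."""
    return [Z[0][0], Z[0][1], Z[1][0]]

def as_matrix(linear_map):
    """3x3 matrix of a linear map on sl(2;C); column j is the image of BASIS[j]."""
    cols = [coords(linear_map(B)) for B in BASIS]
    return [[cols[j][i] for j in range(3)] for i in range(3)]

def ad(Z):
    """Matrix of ad_Z, the map W -> [Z, W]."""
    return as_matrix(lambda W: bracket(Z, W))

def Ad(g, g_inv):
    """Matrix of Ad_g, the map W -> g W g^{-1}."""
    return as_matrix(lambda W: matmul(matmul(g, W), g_inv))

# ad is a Lie algebra homomorphism: ad_[A,B] = [ad_A, ad_B].
homomorphism_ok = all(ad(bracket(A, B)) == bracket(ad(A), ad(B))
                      for A in BASIS for B in BASIS)

# X is nilpotent, so e^X = I + X and e^{ad_X} = I + ad_X + ad_X^2 / 2 exactly.
I3 = [[Fraction(int(i == j)) for j in range(3)] for i in range(3)]
adX = ad(X)
adX2 = matmul(adX, adX)
exp_adX = add(add(I3, adX), [[Fraction(v, 2) for v in row] for row in adX2])
Ad_expX = Ad([[1, 1], [0, 1]], [[1, -1], [0, 1]])

print(homomorphism_ok, exp_adX == Ad_expX)
```

Exact rational arithmetic is used so that the two 3×3 matrices can be compared with `==`; for a non-nilpotent element one would have to truncate the series and compare up to a tolerance.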


4.3.4 Some representations of SU(2) 


Consider the space $V_m$ of homogeneous polynomials in two complex variables
with total degree $m$ ($m \geq 0$); that is, $V_m$ is the space of functions of the form

$$f(z_1, z_2) = a_0 z_1^m + a_1 z_1^{m-1} z_2 + a_2 z_1^{m-2} z_2^2 + \cdots + a_m z_2^m \tag{4.1}$$

with $z_1, z_2 \in \mathbb{C}$ and the $a_i$'s arbitrary complex constants. The space $V_m$ is an
$(m + 1)$-dimensional complex vector space.

Now, by definition, an element $U$ of SU(2) is a linear transformation of
$\mathbb{C}^2$. Let $z$ denote the pair $z = (z_1, z_2)$ in $\mathbb{C}^2$. Then, we may define a linear
transformation $\Pi_m(U)$ on the space $V_m$ by the formula

$$[\Pi_m(U)f](z) = f(U^{-1}z). \tag{4.2}$$

Explicitly, if $f$ is as in (4.1), then

$$[\Pi_m(U)f](z_1, z_2) = \sum_{k=0}^{m} a_k \left(U^{-1}_{11} z_1 + U^{-1}_{12} z_2\right)^{m-k} \left(U^{-1}_{21} z_1 + U^{-1}_{22} z_2\right)^{k}.$$

By expanding out the right-hand side of this formula, we see that $\Pi_m(U)f$
is again a homogeneous polynomial of degree $m$. Thus, $\Pi_m(U)$ actually maps
$V_m$ into $V_m$.

Now, compute

$$\Pi_m(U_1)[\Pi_m(U_2)f](z) = [\Pi_m(U_2)f](U_1^{-1}z) = f(U_2^{-1}U_1^{-1}z) = \Pi_m(U_1U_2)f(z).$$

Thus, $\Pi_m$ is a (finite-dimensional complex) representation of SU(2). The
inverse in (4.2) is necessary in order to make $\Pi_m$ a representation. We will see
eventually that each of the representations $\Pi_m$ of SU(2) is irreducible and
that every finite-dimensional irreducible representation of SU(2) is equivalent
to one (and only one) of the $\Pi_m$'s. (Of course, no two of the $\Pi_m$'s are
equivalent, since they do not even have the same dimension.)
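
The role of the inverse in (4.2) can be seen numerically. The Python sketch below treats elements of $V_3$ as ordinary functions of a pair of complex numbers and compares compositions of operators at a sample point. The parametrization `su2`, the sample polynomial `f`, and the names `Pi` and `without_inverse` are assumptions made for illustration only.

```python
import cmath
import math

def su2(theta, phi, psi):
    """An element [[a, -conj(b)], [b, conj(a)]] of SU(2) with |a|^2 + |b|^2 = 1."""
    a = math.cos(theta) * cmath.exp(1j * phi)
    b = math.sin(theta) * cmath.exp(1j * psi)
    return [[a, -b.conjugate()], [b, a.conjugate()]]

def matmul(U, V):
    """Product of two 2x2 matrices."""
    return [[sum(U[i][k] * V[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def inverse(U):
    """For unitary U the inverse is the conjugate transpose."""
    return [[U[j][i].conjugate() for j in range(2)] for i in range(2)]

def act(U, z):
    """Matrix U applied to the pair z = (z1, z2)."""
    return (U[0][0] * z[0] + U[0][1] * z[1], U[1][0] * z[0] + U[1][1] * z[1])

def Pi(U):
    """The operator of (4.2): f -> f(U^{-1} z)."""
    U_inv = inverse(U)
    return lambda f: (lambda z: f(act(U_inv, z)))

def without_inverse(U):
    """The naive attempt f -> f(U z), shown for comparison."""
    return lambda f: (lambda z: f(act(U, z)))

def f(z):
    """A sample element of V_3: z1^2 z2 + 2 z2^3."""
    return z[0] ** 2 * z[1] + 2 * z[1] ** 3

U1, U2 = su2(0.3, 0.7, -0.2), su2(1.1, -0.4, 0.9)
z = (0.5 + 0.25j, -1.0 + 2.0j)

lhs = Pi(U1)(Pi(U2)(f))(z)              # Pi(U1) Pi(U2) f
rhs = Pi(matmul(U1, U2))(f)(z)          # Pi(U1 U2) f
naive = without_inverse(U1)(without_inverse(U2)(f))(z)
reversed_order = without_inverse(matmul(U2, U1))(f)(z)

print(abs(lhs - rhs) < 1e-9, abs(naive - reversed_order) < 1e-9)
```

With the inverse, composing the operators matches the operator of the product $U_1U_2$. Without it, composition matches the product in the reversed order $U_2U_1$, so the naive map reverses products instead of preserving them.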

Let us now compute the corresponding Lie algebra representation $\pi_m$.
According to Proposition 4.4, $\pi_m$ can be computed as

$$\pi_m(X) = \left.\frac{d}{dt}\,\Pi_m(e^{tX})\right|_{t=0}.$$

So,

$$(\pi_m(X)f)(z) = \left.\frac{d}{dt}\, f(e^{-tX}z)\right|_{t=0}.$$

Now, let $z(t)$ be the curve in $\mathbb{C}^2$ defined as $z(t) = e^{-tX}z$, so that $z(0) = z$.
Of course, $z(t)$ can be written as $z(t) = (z_1(t), z_2(t))$, with $z_i(t) \in \mathbb{C}$. By the
chain rule,

$$\pi_m(X)f = \frac{\partial f}{\partial z_1}\left.\frac{dz_1}{dt}\right|_{t=0} + \frac{\partial f}{\partial z_2}\left.\frac{dz_2}{dt}\right|_{t=0}.$$

However, $dz/dt|_{t=0} = -Xz$, so we obtain the following formula for $\pi_m(X)$:

$$\pi_m(X)f = -\frac{\partial f}{\partial z_1}\,(X_{11}z_1 + X_{12}z_2) - \frac{\partial f}{\partial z_2}\,(X_{21}z_1 + X_{22}z_2). \tag{4.3}$$

Now, according to Proposition 4.6, every finite-dimensional complex
representation of the Lie algebra su(2) extends uniquely to a complex-linear
representation of the complexification of su(2). However, the complexification of
su(2) is (isomorphic to) sl(2;C) (Proposition 2.45). The representation $\pi_m$ of
su(2) given by (4.3) thus extends to a representation of sl(2;C), which we will
also call $\pi_m$ and which (as is easily verified) is also given by (4.3).

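So, for example, consider the element

$$H = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}$$

in the Lie algebra sl(2;C). Applying formula (4.3) gives

$$(\pi_m(H)f)(z) = -\frac{\partial f}{\partial z_1}\,z_1 + \frac{\partial f}{\partial z_2}\,z_2.$$

Thus, we see that

$$\pi_m(H) = -z_1\frac{\partial}{\partial z_1} + z_2\frac{\partial}{\partial z_2}. \tag{4.4}$$

Applying $\pi_m(H)$ to a basis element $z_1^k z_2^{m-k}$, we get

$$\pi_m(H)\,z_1^k z_2^{m-k} = -k\,z_1^k z_2^{m-k} + (m-k)\,z_1^k z_2^{m-k} = (m-2k)\,z_1^k z_2^{m-k}.$$

Thus, $z_1^k z_2^{m-k}$ is an eigenvector for $\pi_m(H)$ with eigenvalue $(m - 2k)$. In
particular, $\pi_m(H)$ is diagonalizable.
Let $X$ and $Y$ be the elements

$$X = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}, \qquad Y = \begin{pmatrix} 0 & 0 \\ 1 & 0 \end{pmatrix}$$

in sl(2;C). Then, (4.3) tells us that

$$\pi_m(X) = -z_2\frac{\partial}{\partial z_1}, \qquad \pi_m(Y) = -z_1\frac{\partial}{\partial z_2},$$

so that

$$\pi_m(X)\,z_1^k z_2^{m-k} = -k\,z_1^{k-1} z_2^{m-k+1},$$
$$\pi_m(Y)\,z_1^k z_2^{m-k} = (k-m)\,z_1^{k+1} z_2^{m-k-1}. \tag{4.5}$$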
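
Formula (4.3) is easy to test on a computer. The following Python sketch stores a polynomial in $z_1, z_2$ as a dictionary mapping exponent pairs $(i, j)$ to coefficients, applies (4.3) for the matrices $H$, $X$, and $Y$, and checks the commutation relation $[\pi_m(X), \pi_m(Y)] = \pi_m(H)$ on each basis monomial. The dictionary encoding and the helper names (`partial`, `times_linear`, `combine`, `pi`) are assumptions of this sketch, not constructions from the text.

```python
def partial(f, var):
    """Partial derivative of a polynomial {(i, j): coeff} in z1^i z2^j
    with respect to z1 (var=0) or z2 (var=1)."""
    out = {}
    for (i, j), c in f.items():
        power = (i, j)[var]
        if power > 0:
            key = (i - 1, j) if var == 0 else (i, j - 1)
            out[key] = out.get(key, 0) + power * c
    return out

def times_linear(f, a, b):
    """Multiply the polynomial f by the linear form a*z1 + b*z2."""
    out = {}
    for (i, j), c in f.items():
        out[(i + 1, j)] = out.get((i + 1, j), 0) + a * c
        out[(i, j + 1)] = out.get((i, j + 1), 0) + b * c
    return out

def combine(f, g, s=1):
    """The polynomial f + s*g, with zero coefficients dropped."""
    out = dict(f)
    for key, c in g.items():
        out[key] = out.get(key, 0) + s * c
    return {key: c for key, c in out.items() if c != 0}

def pi(M, f):
    """Formula (4.3) for a 2x2 matrix M acting on the polynomial f."""
    t1 = times_linear(partial(f, 0), M[0][0], M[0][1])
    t2 = times_linear(partial(f, 1), M[1][0], M[1][1])
    return combine({}, combine(t1, t2), -1)

H = [[1, 0], [0, -1]]
X = [[0, 1], [0, 0]]
Y = [[0, 0], [1, 0]]

m = 3
monomials = [{(k, m - k): 1} for k in range(m + 1)]   # z1^k z2^(m-k)

for k, e in enumerate(monomials):
    print(k, pi(H, e), pi(X, e), pi(Y, e))

# [pi(X), pi(Y)] = pi(H) on every basis monomial.
relation_ok = all(
    combine(pi(X, pi(Y, e)), pi(Y, pi(X, e)), -1) == pi(H, e)
    for e in monomials)
print(relation_ok)
```

Each printed line lists the images of $z_1^k z_2^{m-k}$ under $\pi_m(H)$, $\pi_m(X)$, and $\pi_m(Y)$, which can be compared with the eigenvalue $m - 2k$ and with (4.5).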


Proposition 4.8. The representation $\pi_m$ is an irreducible representation of
sl(2;C).


Proof. It suffices to show that every nonzero invariant subspace of $V_m$ is, in
fact, equal to $V_m$. So, let $W$ be such a space. Since $W$ is assumed nonzero,
there is at least one nonzero element $w$ in $W$. Then, $w$ can be written uniquely
in the form

$$w = a_0 z_1^m + a_1 z_1^{m-1} z_2 + a_2 z_1^{m-2} z_2^2 + \cdots + a_m z_2^m$$

with at least one of the $a_k$'s nonzero. Let $k_0$ be the smallest value of $k$ for
which $a_k \neq 0$ and consider

$$\pi_m(X)^{m-k_0}\,w.$$

Since (by (4.5)) each application of $\pi_m(X)$ lowers the power of $z_1$ by 1,
$\pi_m(X)^{m-k_0}$ will kill all the terms in $w$ except $a_{k_0} z_1^{m-k_0} z_2^{k_0}$. On the other
hand, we compute easily that

$$\pi_m(X)^{m-k_0}\left(z_1^{m-k_0} z_2^{k_0}\right) = (-1)^{m-k_0}\,(m-k_0)!\; z_2^m.$$

We see, then, that $\pi_m(X)^{m-k_0}w$ is a nonzero multiple of $z_2^m$. Since $W$ is
assumed invariant, $W$ must contain $z_2^m$. Furthermore, it follows from (4.5)
that $\pi_m(Y)^k z_2^m$ is a nonzero multiple of $z_1^k z_2^{m-k}$. Therefore, $W$ must also
contain $z_1^k z_2^{m-k}$ for all $0 \leq k \leq m$. Since these elements form a basis for $V_m$,
we see that, in fact, $W = V_m$, as desired. □


4.3.5 Two unitary representations of SO(3) 


Let $H = L^2(\mathbb{R}^3, dx)$, the space of square-integrable functions on $\mathbb{R}^3$. For each
$R \in SO(3)$, define an operator $\Pi_1(R)$ on $H$ by the formula

$$[\Pi_1(R)f](x) = f(R^{-1}x).$$

Since Lebesgue measure $dx$ is rotationally invariant, $\Pi_1(R)$ is a unitary
operator for each $R \in SO(3)$. The calculation of the previous subsection shows that
the map $R \to \Pi_1(R)$ is a homomorphism of SO(3) into $U(H)$. This map is
strongly continuous and hence constitutes a unitary representation of SO(3).

Similarly, we may consider the unit sphere $S^2 \subset \mathbb{R}^3$, with the usual surface
measure $\Omega$. Of course, any $R \in SO(3)$ maps $S^2$ into $S^2$. For each $R$, we can
define $\Pi_2(R)$ acting on $L^2(S^2, d\Omega)$ by

$$[\Pi_2(R)f](x) = f(R^{-1}x).$$

Then, $\Pi_2$ is a unitary representation of SO(3).

Neither of the unitary representations $\Pi_1$ and $\Pi_2$ is irreducible. In the case
of $\Pi_2$, $L^2(S^2, d\Omega)$ has a very nice decomposition as the orthogonal direct sum
of finite-dimensional invariant subspaces. This decomposition is the theory of
“spherical harmonics,” which are well known in the physics (and mathematics)
literature.


4.3.6 A unitary representation of the reals 
Let $H = L^2(\mathbb{R}, dx)$. For each $a \in \mathbb{R}$, define $T_a : H \to H$ by

$$(T_a f)(x) = f(x - a).$$

Clearly, $T_a$ is a unitary operator for each $a \in \mathbb{R}$ and, clearly, $T_a T_b = T_{a+b}$.
The map $a \to T_a$ is strongly continuous, so $T$ is a unitary representation of
$\mathbb{R}$. This representation is not irreducible. The theory of the Fourier transform
allows one to determine all the closed, invariant subspaces of $H$ (Theorem
9.17 of Rudin (1987)).


4.3.7 The unitary representations of the Heisenberg group 
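Consider the Heisenberg group

$$H = \left\{ \begin{pmatrix} 1 & a & b \\ 0 & 1 & c \\ 0 & 0 & 1 \end{pmatrix} \;\middle|\; a, b, c \in \mathbb{R} \right\}.$$

Now, consider a real, nonzero constant, which, for reasons of historical
convention, we will call $\hbar$ (“h bar”). For each $\hbar \in \mathbb{R}\setminus\{0\}$, define a unitary
operator $\Pi_\hbar(A)$ on $L^2(\mathbb{R}, dx)$ by

$$\Pi_\hbar \begin{pmatrix} 1 & a & b \\ 0 & 1 & c \\ 0 & 0 & 1 \end{pmatrix} f = e^{-i\hbar b}\, e^{i\hbar c x}\, f(x - a). \tag{4.6}$$

It is clear that the right-hand side of (4.6) has the same norm as $f$, so $\Pi_\hbar(A)$ is,
indeed, unitary.
Now, compute

$$\Pi_\hbar \begin{pmatrix} 1 & \tilde a & \tilde b \\ 0 & 1 & \tilde c \\ 0 & 0 & 1 \end{pmatrix} \Pi_\hbar \begin{pmatrix} 1 & a & b \\ 0 & 1 & c \\ 0 & 0 & 1 \end{pmatrix} f
= e^{-i\hbar \tilde b}\, e^{i\hbar \tilde c x}\, e^{-i\hbar b}\, e^{i\hbar c (x - \tilde a)}\, f(x - \tilde a - a)$$
$$= e^{-i\hbar(\tilde b + b + c\tilde a)}\, e^{i\hbar(\tilde c + c)x}\, f(x - (\tilde a + a)).$$

Since the product of the two matrices has entries $\tilde a + a$, $\tilde b + b + \tilde a c$, and
$\tilde c + c$ in place of $a$, $b$, and $c$, this shows that the map $A \to \Pi_\hbar(A)$ is a
homomorphism of the Heisenberg group into $U(L^2(\mathbb{R}))$. This map is strongly
continuous and, so, $\Pi_\hbar$ is a unitary representation of $H$.

Note that a typical unitary operator $\Pi_\hbar(A)$ consists of first translating $f$,
then multiplying $f$ by the function $e^{i\hbar c x}$, and then multiplying $f$ by the
constant $e^{-i\hbar b}$. Multiplying $f$ by the function $e^{i\hbar c x}$ has the effect of translating
the Fourier transform of $f$, or, in physical language, “translating $f$ in momentum
space.” Now, if $U_1$ is an ordinary translation and $U_2$ is a translation of
the Fourier transform (i.e., $U_2$ = multiplication by some $e^{i\hbar c x}$), then $U_1$ and
$U_2$ will not commute, but $U_1 U_2 U_1^{-1} U_2^{-1}$ will be simply multiplication by a
constant of absolute value one. Thus, $\{\Pi_\hbar(A) \mid A \in H\}$ is the group of
operators on $L^2(\mathbb{R})$ generated by ordinary translations and translations in Fourier
space. It is this representation of the Heisenberg group which motivates its
name. (See also Exercise 8.)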
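
The homomorphism property of (4.6) can be checked numerically. The Python sketch below stores a Heisenberg matrix as the triple $(a, b, c)$ of its entries above the diagonal, implements (4.6) as an operator on functions, and compares the two sides at a few sample points. The names `heis_mul` and `Pi`, the Gaussian test function, and the numerical values are all assumptions chosen for illustration.

```python
import cmath
import math

def heis_mul(A, B):
    """Product of two Heisenberg matrices stored as triples (a, b, c)."""
    a1, b1, c1 = A
    a2, b2, c2 = B
    return (a1 + a2, b1 + b2 + a1 * c2, c1 + c2)

def Pi(hbar, A):
    """The operator of (4.6): f -> e^{-i hbar b} e^{i hbar c x} f(x - a)."""
    a, b, c = A
    def operator(f):
        return lambda x: (cmath.exp(-1j * hbar * b)
                          * cmath.exp(1j * hbar * c * x) * f(x - a))
    return operator

def gaussian(x):
    """A sample square-integrable function."""
    return math.exp(-x * x)

hbar = 1.7
A_tilde, A = (0.5, -1.0, 2.0), (1.5, 0.25, -0.75)
xs = [-2.0, -0.3, 0.0, 1.1, 3.0]

composed = Pi(hbar, A_tilde)(Pi(hbar, A)(gaussian))     # Pi(A~) Pi(A) f
direct = Pi(hbar, heis_mul(A_tilde, A))(gaussian)       # Pi(A~ A) f
max_gap = max(abs(composed(x) - direct(x)) for x in xs)

# Group commutator of a translation (a, 0, 0) and a momentum shift (0, 0, c).
a, c = 0.5, 2.0
commutator = heis_mul(heis_mul((a, 0.0, 0.0), (0.0, 0.0, c)),
                      heis_mul((-a, 0.0, 0.0), (0.0, 0.0, -c)))

print(max_gap < 1e-9, commutator)
```

The commutator comes out as the central element with $b = ac$ and $a = c = 0$, so by (4.6) the operator $U_1 U_2 U_1^{-1} U_2^{-1}$ is multiplication by the constant $e^{-i\hbar ac}$, in agreement with the remark above.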

It follows fairly easily from standard Fourier transform theory (e.g.,
Theorem 9.17 of Rudin (1987)), that for each $\hbar \in \mathbb{R}\setminus\{0\}$, the representation $\Pi_\hbar$
is irreducible. Furthermore, these are (up to equivalence) almost all of the
irreducible unitary representations of $H$. The only remaining ones are the
one-dimensional representations $\Pi_{\alpha,\beta}$ given by

$$\Pi_{\alpha,\beta} \begin{pmatrix} 1 & a & b \\ 0 & 1 & c \\ 0 & 0 & 1 \end{pmatrix} = e^{i(\alpha a + \beta c)}$$

with $\alpha, \beta \in \mathbb{R}$. (The $\Pi_{\alpha,\beta}$'s are the irreducible unitary representations in
which the center of $H$ acts trivially.) The fact that the $\Pi_\hbar$'s and the $\Pi_{\alpha,\beta}$'s
are all of the (strongly continuous) irreducible unitary representations of $H$ is
closely related to the celebrated Stone-von Neumann theorem in mathematical
physics. See, for example, Reed and Simon (1979), Theorem XI.84. See
also Exercise 9.


4.4 The Irreducible Representations of su(2) 


In this section, we will compute (up to equivalence) all of the finite-dimensional 
irreducible complex representations of the Lie algebra su(2). This computation
is important for several reasons. In the first place, $\mathfrak{su}(2) \cong \mathfrak{so}(3)$ and the
representations of so(3) are of physical significance. (The computation we will
do here is found in every standard textbook on quantum mechanics, under 
the heading “angular momentum.”) In the second place, the representation 
theory of su(2) is an illuminating example of how one uses commutation rela- 
tions to determine the representations of a Lie algebra. In the third place, in 
determining the representations of semisimple Lie algebras (Chapters 5 and 
6), we will explicitly use the representation theory of su(2). 

Now, every finite-dimensional complex representation $\pi$ of su(2) extends
by Proposition 4.6 to a complex-linear representation (also called $\pi$) of the
complexification of su(2), namely sl(2;C). (Recall Section 2.9.) The extension
of $\pi$ to sl(2;C) is irreducible if and only if the original representation is
irreducible, again by Proposition 4.6.

We see, then, that studying the irreducible representations of su(2) is
equivalent to studying the irreducible (complex-linear) representations of
sl(2;C). Passing to the complexified Lie algebra makes our computations easier,
in that we can find a nice basis for sl(2;C) that has no counterpart among
the bases of su(2).

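We will use the following basis for sl(2;C):

$$H = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}, \quad X = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}, \quad Y = \begin{pmatrix} 0 & 0 \\ 1 & 0 \end{pmatrix},$$

which have the commutation relations

$$[H, X] = 2X, \qquad [H, Y] = -2Y, \qquad [X, Y] = H.$$

If $V$ is a (finite-dimensional complex) vector space and $A$, $B$, and $C$ are
operators on $V$ satisfying

$$[A, B] = 2B, \qquad [A, C] = -2C, \qquad [B, C] = A,$$

then because of the skew symmetry and bilinearity of brackets, the linear map
$\pi : \mathfrak{sl}(2;\mathbb{C}) \to \mathfrak{gl}(V)$ satisfying

$$\pi(H) = A, \quad \pi(X) = B, \quad \pi(Y) = C$$

will be a representation of sl(2;C).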


Theorem 4.9. For each integer $m \geq 0$, there is an irreducible representation
of sl(2;C) with dimension $m+1$. Any two irreducible representations of sl(2;C)
with the same dimension are equivalent. If $\pi$ is an irreducible representation
of sl(2;C) with dimension $m + 1$, then $\pi$ is equivalent to the representation
$\pi_m$ described in Section 4.3.4.


Proof. Let $\pi$ be an irreducible representation of sl(2;C) acting on a
(finite-dimensional complex) space $V$. Our strategy is to diagonalize the operator
$\pi(H)$. Of course, a priori, we do not know that $\pi(H)$ is diagonalizable.
However, because we are working over the (algebraically closed) field of complex
numbers, $\pi(H)$ must have at least one eigenvector.

The following lemma is the key to the entire proof. 


Lemma 4.10. Let $u$ be an eigenvector of $\pi(H)$ with eigenvalue $\alpha \in \mathbb{C}$. Then,

$$\pi(H)\pi(X)u = (\alpha + 2)\pi(X)u.$$

Thus, either $\pi(X)u = 0$ or $\pi(X)u$ is an eigenvector for $\pi(H)$ with eigenvalue
$\alpha + 2$. Similarly,

$$\pi(H)\pi(Y)u = (\alpha - 2)\pi(Y)u$$

so that either $\pi(Y)u = 0$ or $\pi(Y)u$ is an eigenvector for $\pi(H)$ with eigenvalue
$\alpha - 2$.


Proof. We call $\pi(X)$ the “raising operator,” because it has the effect of raising
the eigenvalue of $\pi(H)$ by 2, and we call $\pi(Y)$ the “lowering operator.” We
know that $[\pi(H), \pi(X)] = \pi([H, X]) = 2\pi(X)$. Thus,

$$\pi(H)\pi(X) - \pi(X)\pi(H) = 2\pi(X)$$

or

$$\pi(H)\pi(X) = \pi(X)\pi(H) + 2\pi(X).$$

Thus,

$$\pi(H)\pi(X)u = \pi(X)\pi(H)u + 2\pi(X)u = \pi(X)(\alpha u) + 2\pi(X)u = (\alpha + 2)\pi(X)u.$$

Similarly, $[\pi(H), \pi(Y)] = -2\pi(Y)$, and, so,

$$\pi(H)\pi(Y) = \pi(Y)\pi(H) - 2\pi(Y)$$

so that

$$\pi(H)\pi(Y)u = \pi(Y)\pi(H)u - 2\pi(Y)u = \pi(Y)(\alpha u) - 2\pi(Y)u = (\alpha - 2)\pi(Y)u.$$

This is what we wanted to show. □


As we have observed, $\pi(H)$ must have at least one eigenvector $u$ ($u \neq 0$),
with some eigenvalue $\alpha \in \mathbb{C}$. By the lemma,

$$\pi(H)\pi(X)u = (\alpha + 2)\pi(X)u$$

and, more generally,

$$\pi(H)\pi(X)^n u = (\alpha + 2n)\pi(X)^n u.$$

This means that either $\pi(X)^n u = 0$ or $\pi(X)^n u$ is an eigenvector for $\pi(H)$
with eigenvalue $\alpha + 2n$.

Now, an operator on a finite-dimensional space can have only finitely many
distinct eigenvalues. Thus, the $\pi(X)^n u$'s cannot all be different from zero.
Thus, there is some $N \geq 0$ such that

$$\pi(X)^N u \neq 0$$

but

$$\pi(X)^{N+1} u = 0.$$

Define $u_0 = \pi(X)^N u$ and $\lambda = \alpha + 2N$. Then,

$$\pi(H)u_0 = \lambda u_0, \tag{4.7}$$
$$\pi(X)u_0 = 0. \tag{4.8}$$

Then, define

$$u_k = \pi(Y)^k u_0$$

for $k \geq 0$. By the second part of the lemma, we have

$$\pi(H)u_k = (\lambda - 2k)\,u_k. \tag{4.9}$$

Since, again, $\pi(H)$ can have only finitely many eigenvalues, the $u_k$'s cannot
all be nonzero.


Lemma 4.11. With the above notation,

$$\pi(X)u_k = [k\lambda - k(k-1)]\,u_{k-1} \quad (k > 0),$$
$$\pi(X)u_0 = 0.$$


Proof. We proceed by induction on $k$. In the case $k = 1$, we note that
$u_1 = \pi(Y)u_0$. Using the commutation relation $[\pi(X), \pi(Y)] = \pi(H)$, we have

$$\pi(X)u_1 = \pi(X)\pi(Y)u_0 = (\pi(Y)\pi(X) + \pi(H))\,u_0.$$

However, $\pi(X)u_0 = 0$, so we get

$$\pi(X)u_1 = \lambda u_0,$$

which is the lemma in the case $k = 1$.
Now, by definition, $u_{k+1} = \pi(Y)u_k$. Using (4.9) and induction, we have

$$\pi(X)u_{k+1} = \pi(X)\pi(Y)u_k$$
$$= (\pi(Y)\pi(X) + \pi(H))\,u_k$$
$$= \pi(Y)[k\lambda - k(k-1)]\,u_{k-1} + (\lambda - 2k)\,u_k$$
$$= [k\lambda - k(k-1) + (\lambda - 2k)]\,u_k.$$

Simplifying the last expression to $[(k+1)\lambda - (k+1)k]\,u_k$ gives the lemma. □


Since $\pi(H)$ can have only finitely many eigenvalues, the $u_k$'s cannot all be
nonzero. There must, therefore, be a non-negative integer $m$ such that

$$u_k = \pi(Y)^k u_0 \neq 0$$

for all $k \leq m$, but

$$u_{m+1} = \pi(Y)^{m+1} u_0 = 0.$$

Now, if $u_{m+1} = 0$, then, certainly, $\pi(X)u_{m+1} = 0$. Then, by Lemma 4.11,

$$0 = \pi(X)u_{m+1} = [(m+1)\lambda - m(m+1)]\,u_m = (m+1)(\lambda - m)\,u_m.$$

However, $u_m \neq 0$ and $m + 1 \neq 0$ (since $m \geq 0$). Thus, in order to have
$(m+1)(\lambda - m)u_m$ equal to zero, we must have $\lambda = m$, where $m$ is a
non-negative integer. (This also shows that the eigenvalue of $\pi(H)$ that we started
with, $\alpha = \lambda - 2N$, must be an integer.)

We have made considerable progress. Given a finite-dimensional irreducible
representation $\pi$ of sl(2;C), acting on a space $V$, there exists an integer $m \geq 0$
and nonzero vectors $u_0, \ldots, u_m$ such that (putting $\lambda$ equal to $m$)

$$\begin{aligned}
\pi(H)u_k &= (m - 2k)\,u_k, \\
\pi(Y)u_k &= u_{k+1} \quad (k < m), \\
\pi(Y)u_m &= 0, \\
\pi(X)u_k &= [km - k(k-1)]\,u_{k-1} \quad (k > 0), \\
\pi(X)u_0 &= 0.
\end{aligned} \tag{4.10}$$

The vectors $u_0, \ldots, u_m$ must be linearly independent, since they are
eigenvectors of $\pi(H)$ with distinct eigenvalues (Proposition B.1). Moreover, the
$(m + 1)$-dimensional span of $u_0, \ldots, u_m$ is explicitly invariant under $\pi(H)$,
$\pi(X)$, and $\pi(Y)$ and, hence, under $\pi(Z)$ for all $Z \in \mathfrak{sl}(2;\mathbb{C})$. Since $\pi$ is
irreducible, this space must be all of $V$.

We have now shown that every irreducible representation of sl(2;C) is of
the form (4.10). It remains to show that everything of the form (4.10) is a
representation and that it is irreducible. That is, if we define $\pi(H)$, $\pi(X)$, and
$\pi(Y)$ by (4.10) (where the $u_k$'s are basis elements for some $(m+1)$-dimensional
vector space), then we want to show that they have the right commutation
relations to form a representation of sl(2;C) and that this representation is
irreducible. One way to do this is to show that the representations $\pi_m$
constructed in the previous section have a basis of the form (4.10). Alternatively,
we can directly check that operators defined as in (4.10) really do satisfy the
sl(2;C) commutation relations (Exercise 4), and then prove irreducibility in
the same way as in the proof of Proposition 4.8.

We have now shown that there is an irreducible representation of sl(2;C)
in each dimension $m + 1$, by writing explicitly (in (4.10)) how $H$, $X$, and $Y$
should act in a basis. However, we have shown more than this. We also have
shown that any $(m+1)$-dimensional irreducible representation of sl(2;C) must
be of the form (4.10). It follows that any two irreducible representations of
sl(2;C) of dimension $(m + 1)$ must be equivalent, for if $\pi_1$ and $\pi_2$ are two
irreducible representations of dimension $(m + 1)$, acting on spaces $V_1$ and
$V_2$, then $V_1$ has a basis $u_0, \ldots, u_m$ as in (4.10) and $V_2$ has a similar basis
$\tilde u_0, \ldots, \tilde u_m$. However, then the map $\phi : V_1 \to V_2$ which sends $u_k$ to $\tilde u_k$ will be
an equivalence of representations, as a moment’s thought will confirm.

In particular, the $(m + 1)$-dimensional representation $\pi_m$ described in
Section 4.3 must be equivalent to (4.10). This can be seen explicitly by introducing
the following basis for $V_m$:

$$u_k = [\pi_m(Y)]^k\,(z_2^m) = (-1)^k\,\frac{m!}{(m-k)!}\; z_1^k z_2^{m-k} \quad (k = 0, \ldots, m).$$

Then, by definition, $\pi_m(Y)u_k = u_{k+1}$ ($k < m$), and it is clear that $\pi_m(Y)u_m = 0$.
It is easy to see that $\pi_m(H)u_k = (m - 2k)u_k$. The only thing left to check
is the behavior of $\pi_m(X)$. However, direct computation shows that

$$\pi_m(X)u_k = k(m - k + 1)\,u_{k-1} = [km - k(k-1)]\,u_{k-1},$$

as required.
This completes the proof of Theorem 4.9. □
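
The direct check that the operators in (4.10) satisfy the sl(2;C) commutation relations (Exercise 4) can be carried out mechanically for small $m$. In the Python sketch below, `rep_matrices` (a hypothetical helper name) builds the matrices of $\pi(H)$, $\pi(X)$, and $\pi(Y)$ in the basis $u_0, \ldots, u_m$, with column $k$ holding the image of $u_k$. This verifies the relations for the sampled values of $m$ only and is not a substitute for the general argument.

```python
def matmul(A, B):
    """Product of two square matrices given as nested lists."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def bracket(A, B):
    """Commutator [A, B] = AB - BA."""
    AB, BA = matmul(A, B), matmul(B, A)
    return [[x - y for x, y in zip(ra, rb)] for ra, rb in zip(AB, BA)]

def scale(c, A):
    """The matrix c*A."""
    return [[c * x for x in row] for row in A]

def rep_matrices(m):
    """Matrices of pi(H), pi(X), pi(Y) in the basis u_0, ..., u_m of (4.10).
    Column k holds the coordinates of the image of u_k."""
    n = m + 1
    Hm = [[0] * n for _ in range(n)]
    Xm = [[0] * n for _ in range(n)]
    Ym = [[0] * n for _ in range(n)]
    for k in range(n):
        Hm[k][k] = m - 2 * k                      # pi(H) u_k = (m - 2k) u_k
        if k < m:
            Ym[k + 1][k] = 1                      # pi(Y) u_k = u_{k+1}
        if k > 0:
            Xm[k - 1][k] = k * m - k * (k - 1)    # pi(X) u_k = [km - k(k-1)] u_{k-1}
    return Hm, Xm, Ym

def satisfies_relations(m):
    """Check [H, X] = 2X, [H, Y] = -2Y, [X, Y] = H for the matrices of (4.10)."""
    Hm, Xm, Ym = rep_matrices(m)
    return (bracket(Hm, Xm) == scale(2, Xm)
            and bracket(Hm, Ym) == scale(-2, Ym)
            and bracket(Xm, Ym) == Hm)

print(all(satisfies_relations(m) for m in range(8)))
print(rep_matrices(2))
```

For $m = 2$ the matrix of $\pi(H)$ is diagonal with entries $2, 0, -2$, which matches the eigenvalues $m - 2k$.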


If we look carefully at the proof of Theorem 4.9, we see that the argument 
can tell us something about finite-dimensional, not necessarily irreducible rep- 
resentations of sl(2; C). In particular, up to and including (4.10), the argument 
does not use irreducibility, which is used only to show that the vectors in (4.10) 
span V. Thus, we obtain the following result about arbitrary finite-dimensional 
representations of sl(2;C). 


Theorem 4.12. Suppose $\pi$ is any finite-dimensional, complex-linear
representation of sl(2;C) acting on a space $V$. Then, we have the following results:


1. Every eigenvalue of $\pi(H)$ must be an integer.

2. If $v$ is a nonzero element of $V$ such that $\pi(X)v = 0$ and $\pi(H)v = \lambda v$,
then there is a non-negative integer $m$ such that $\lambda = m$. Furthermore, the
vectors $v, \pi(Y)v, \ldots, \pi(Y)^m v$ are linearly independent and their span is
an irreducible invariant subspace of dimension $m + 1$.


4.5 Direct Sums of Representations 


One way of generating representations is to take some representations one 
knows and combine them in some fashion. In this section and the next two, 
we will consider the three standard methods of obtaining new representations 
from old, namely direct sums of representations, tensor products of represen- 
tations, and dual representations. 


Definition 4.13. Let $G$ be a matrix Lie group and let $\Pi_1, \Pi_2, \ldots, \Pi_m$ be
representations of $G$ acting on vector spaces $V_1, V_2, \ldots, V_m$. Then, the direct
sum of $\Pi_1, \Pi_2, \ldots, \Pi_m$ is a representation $\Pi_1 \oplus \cdots \oplus \Pi_m$ of $G$ acting on the
space $V_1 \oplus \cdots \oplus V_m$, defined by

$$[\Pi_1 \oplus \cdots \oplus \Pi_m(A)]\,(v_1, \ldots, v_m) = (\Pi_1(A)v_1, \ldots, \Pi_m(A)v_m)$$

for all $A \in G$.
Similarly, if $\mathfrak{g}$ is a Lie algebra, and $\pi_1, \pi_2, \ldots, \pi_m$ are representations of
$\mathfrak{g}$ acting on $V_1, V_2, \ldots, V_m$, then we define the direct sum of $\pi_1, \pi_2, \ldots, \pi_m$,
acting on $V_1 \oplus \cdots \oplus V_m$ by

$$[\pi_1 \oplus \cdots \oplus \pi_m(X)]\,(v_1, \ldots, v_m) = (\pi_1(X)v_1, \ldots, \pi_m(X)v_m)$$

for all $X \in \mathfrak{g}$.


It is straightforward to check that, say, $\Pi_1 \oplus \cdots \oplus \Pi_m$ is really a
representation of $G$.

An important property that some matrix Lie groups and Lie algebras have 
is the complete reducibility property. This means that every finite-dimensional 
representation is isomorphic to a direct sum of irreducible representations. 
For such groups, once we know all the irreducible representations, we know 
all the representations. By no means do all groups have this property. We will 
discuss this issue further in Section 4.10 of this chapter and in Chapter 6. 
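
In matrix terms, a direct sum of representations acts by block-diagonal matrices. As a small illustration, the Python sketch below forms the direct sum of the standard and trivial representations of sl(2;C) and checks that it preserves brackets and that each summand is an invariant subspace. The helper `direct_sum` and the sample vectors are assumptions of this sketch.

```python
def matmul(A, B):
    """Product of two square matrices given as nested lists."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def bracket(A, B):
    """Commutator [A, B] = AB - BA."""
    AB, BA = matmul(A, B), matmul(B, A)
    return [[x - y for x, y in zip(ra, rb)] for ra, rb in zip(AB, BA)]

def direct_sum(*blocks):
    """Block-diagonal matrix with the given square blocks on the diagonal."""
    n = sum(len(b) for b in blocks)
    out = [[0] * n for _ in range(n)]
    offset = 0
    for b in blocks:
        for i, row in enumerate(b):
            for j, value in enumerate(row):
                out[offset + i][offset + j] = value
        offset += len(b)
    return out

def apply(M, v):
    """Matrix M applied to the coordinate vector v."""
    return [sum(M[i][j] * v[j] for j in range(len(v))) for i in range(len(M))]

H = [[1, 0], [0, -1]]
X = [[0, 1], [0, 0]]
Y = [[0, 0], [1, 0]]
BASIS = [H, X, Y]

def standard(Z):
    """Standard representation of sl(2;C): Z acts as itself on C^2."""
    return Z

def trivial(Z):
    """Trivial representation: every Z acts as 0 on C."""
    return [[0]]

def summed(Z):
    """The direct sum of the standard and trivial representations, on C^3."""
    return direct_sum(standard(Z), trivial(Z))

is_rep = all(summed(bracket(A, B)) == bracket(summed(A), summed(B))
             for A in BASIS for B in BASIS)

# Vectors in the first summand stay there, and likewise for the second.
first_invariant = all(apply(summed(Z), [3, -1, 0])[2] == 0 for Z in BASIS)
second_invariant = all(apply(summed(Z), [0, 0, 5])[:2] == [0, 0] for Z in BASIS)

print(is_rep, first_invariant, second_invariant)
```

The resulting three-dimensional representation is reducible, since each block is a nontrivial invariant subspace, but it is by construction a direct sum of irreducible representations.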


4.6 Tensor Products of Representations 


Let U and V be finite-dimensional real or complex vector spaces. We wish to 
define the tensor product of U and V, which will be a new vector space 
UQV “built” out of U and V. We will discuss the idea of this first and then 
give the precise definition. 

We wish to consider a formal “product” of an element u of U with an 
element v of V, denoted u & v. The space U & V is then the space of linear 
combinations of such products, that is, the space of elements of the form 


aiui Q Vy + aque Q V2 ++++ + anUn Q Un. (4.11) 


Of course, if “®” is to be interpreted as a product, then it should be bilinear; 
that is, we should have 


(uy + aug) 8 v = u, @u+ aug Qv, 
uQ (v1 + ave) = u Q vı + au 8 vo. 


We do not assume that the product is commutative. (In fact, the product in 
the other order, v @ u, is in a different space, namely V @ U.) 

Now, if e1,€2,...,€n is a basis for U and fi, fo,...,; fm is a basis for V, 
then, using bilinearity, it is easy to see that any element of the form (4.11) 
can be written as a linear combination of the elements e; ® fj. In fact, it 
seems reasonable to expect that {e; ® f;|1<i<n,1<j<m} should be a 
basis for the space U @ V. This, in fact, turns out to be the case. 


Definition 4.14. If U and V are finite-dimensional real or complex vector
spaces, then a tensor product of U with V is a vector space W, together
with a bilinear map $\phi : U \times V \to W$ with the following property: If $\psi$ is any
bilinear map of $U \times V$ into a vector space X, then there exists a unique linear
map $\tilde{\psi}$ of W into X such that the following diagram commutes (that is,
$\tilde{\psi} \circ \phi = \psi$):


\[
\begin{array}{ccc}
U \times V & \xrightarrow{\;\phi\;} & W \\
 & {\scriptstyle \psi} \searrow \quad \swarrow {\scriptstyle \tilde{\psi}} & \\
 & X &
\end{array}
\]


Note that the bilinear map $\psi$ from $U \times V$ into X turns into the linear map
$\tilde{\psi}$ of W into X. This is one of the points of tensor products: Bilinear maps on
$U \times V$ turn into linear maps on W.


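Theorem 4.15. If U and V are any finite-dimensional real or complex vector
spaces, then a tensor product $(W, \phi)$ exists. Furthermore, $(W, \phi)$ is unique up
to canonical isomorphism. That is, if $(W_1, \phi_1)$ and $(W_2, \phi_2)$ are two tensor
products, then there exists a unique vector space isomorphism $\Phi : W_1 \to W_2$
such that the following diagram commutes (that is, $\Phi \circ \phi_1 = \phi_2$):


\[
\begin{array}{ccc}
U \times V & \xrightarrow{\;\phi_1\;} & W_1 \\
 & {\scriptstyle \phi_2} \searrow \quad \swarrow {\scriptstyle \Phi} & \\
 & W_2 &
\end{array}
\]


Suppose that $(W, \phi)$ is a tensor product and that $e_1, e_2, \ldots, e_n$ is a basis for
U and $f_1, f_2, \ldots, f_m$ is a basis for V. Then, $\{\phi(e_i, f_j) \mid 1 \le i \le n,\ 1 \le j \le m\}$
is a basis for W.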


Proof. Exercise 10. □


Notation 4.16 Since the tensor product of U and V is essentially unique,
we will let $U \otimes V$ denote an arbitrary tensor product space and we will
write $u \otimes v$ instead of $\phi(u, v)$. In this notation, Theorem 4.15 says that
$\{e_i \otimes f_j \mid 1 \le i \le n,\ 1 \le j \le m\}$ is a basis for $U \otimes V$, as expected. Note in
particular that

\[ \dim(U \otimes V) = (\dim U)(\dim V) \]


(not $\dim U + \dim V$).
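
As a concrete check, one can model U and V by coordinate vectors and realize $u \otimes v$ as the list of all products of coordinates. The following Python sketch is only an illustration under that assumption (the helper names `kron_vec` and `basis` are ours, not notation from the text). It confirms that the vectors $e_i \otimes f_j$ form the standard basis of a space of dimension nm, and that the product is linear in the first factor.

```python
def kron_vec(u, v):
    """Coordinates of u (x) v in the basis e_i (x) f_j, with j varying fastest."""
    return [ui * vj for ui in u for vj in v]

def basis(n):
    """Standard basis of an n-dimensional coordinate space."""
    return [[1 if i == k else 0 for i in range(n)] for k in range(n)]

n, m = 2, 3
tensor_basis = [kron_vec(e, f) for e in basis(n) for f in basis(m)]
dim_tensor = len(tensor_basis)  # equals n * m, not n + m

# Linearity in the first slot: (u1 + a*u2) (x) v == u1 (x) v + a * (u2 (x) v)
u1, u2, v, a = [1, 2], [3, -1], [2, 0, 5], 4
lhs = kron_vec([x + a * y for x, y in zip(u1, u2)], v)
rhs = [p + a * q for p, q in zip(kron_vec(u1, v), kron_vec(u2, v))]
bilinear_ok = (lhs == rhs)
```

Here `tensor_basis` comes out as the standard basis of a six-dimensional space, in agreement with the dimension formula.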


The defining property of $U \otimes V$ is called the universal property of tensor
products. Although it may seem that we are taking a simple idea and making
it confusing, in fact there is a point to this universal property. Suppose we
want to define a linear map T from $U \otimes V$ into some other space. The most
sensible way to define this is to define T on elements of the form $u \otimes v$.
(We might try defining it on a basis, but this would force us to worry about
whether things depend on the choice of basis.) Now, every element of $U \otimes V$ is
a linear combination of things of the form $u \otimes v$. However, this representation
is far from unique. (For instance, if $u = u_1 + u_2$, then one can rewrite $u \otimes v$ as
$u_1 \otimes v + u_2 \otimes v$.)

Thus, if we try to define T by what it does to elements of the form $u \otimes v$,
we have to worry about whether T is well defined. This is where the universal
property comes in. Suppose that $\psi(u, v)$ is some bilinear expression in $(u, v)$.
Then, the universal property says precisely that there is a unique linear map
T ($= \tilde{\psi}$) such that

\[ T(u \otimes v) = \psi(u, v). \]


The conclusion is this: We can define a linear map T on $U \otimes V$ by defining
it on elements of the form $u \otimes v$, and this will be well defined, provided that
$T(u \otimes v)$ is bilinear in $(u, v)$. The following proposition illustrates how to make
use of this idea.


Proposition 4.17. Let U and V be finite-dimensional real or complex vector
spaces. Let $A : U \to U$ and $B : V \to V$ be linear operators. Then, there exists
a unique linear operator from $U \otimes V$ to $U \otimes V$, denoted $A \otimes B$, such that


\[ (A \otimes B)(u \otimes v) = (Au) \otimes (Bv) \]


for all $u \in U$ and $v \in V$.
If $A_1$ and $A_2$ are linear operators on U and $B_1$ and $B_2$ are linear operators
on V, then

\[ (A_1 \otimes B_1)(A_2 \otimes B_2) = (A_1 A_2) \otimes (B_1 B_2). \]


Proof. Define a map $\psi$ from $U \times V$ into $U \otimes V$ by

\[ \psi(u, v) = (Au) \otimes (Bv). \]


Since A and B are linear and since $\otimes$ is bilinear, $\psi$ will be a bilinear map of
$U \times V$ into $U \otimes V$. However, then the universal property says that there is an
associated linear map $\tilde{\psi} : U \otimes V \to U \otimes V$ such that


\[ \tilde{\psi}(u \otimes v) = \psi(u, v) = (Au) \otimes (Bv). \]


Then, $\tilde{\psi}$ is the desired map $A \otimes B$.
Now, if $A_1$ and $A_2$ are operators on U and $B_1$ and $B_2$ are operators on V,
then compute that


\[
\begin{aligned}
(A_1 \otimes B_1)(A_2 \otimes B_2)(u \otimes v) &= (A_1 \otimes B_1)(A_2 u \otimes B_2 v) \\
&= A_1 A_2 u \otimes B_1 B_2 v.
\end{aligned}
\]


This shows that $(A_1 \otimes B_1)(A_2 \otimes B_2)$ and $(A_1 A_2) \otimes (B_1 B_2)$ are equal on
elements of the form $u \otimes v$. Since every element of $U \otimes V$ can be written as a linear
combination of things of the form $u \otimes v$ (in fact, of $e_i \otimes f_j$), $(A_1 \otimes B_1)(A_2 \otimes B_2)$
and $(A_1 A_2) \otimes (B_1 B_2)$ must be equal on the whole space. □
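
In coordinates, the operator $A \otimes B$ is represented by the Kronecker product of the matrices of A and B, with respect to the basis $e_i \otimes f_j$ ordered with j varying fastest. The following Python sketch checks both statements of Proposition 4.17 on small integer matrices. The matrices and helper names are arbitrary choices made for illustration, and a check on examples is of course not a proof.

```python
def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def kron(A, B):
    """Matrix of A (x) B in the basis e_i (x) f_j (j varying fastest)."""
    return [[a * b for a in row_a for b in row_b] for row_a in A for row_b in B]

def kron_vec(u, v):
    return [ui * vj for ui in u for vj in v]

def apply(M, x):
    return [sum(c * xi for c, xi in zip(row, x)) for row in M]

A1, A2 = [[1, 2], [0, 1]], [[3, 0], [1, -1]]
B1 = [[0, 1, 0], [2, 0, 1], [1, 1, 1]]
B2 = [[1, 0, 2], [0, -1, 0], [3, 1, 1]]

# (A1 (x) B1)(A2 (x) B2) == (A1 A2) (x) (B1 B2)
mixed_product_ok = (matmul(kron(A1, B1), kron(A2, B2))
                    == kron(matmul(A1, A2), matmul(B1, B2)))

# (A (x) B)(u (x) v) == (Au) (x) (Bv)
u, v = [1, -2], [3, 0, 1]
action_ok = (apply(kron(A1, B1), kron_vec(u, v))
             == kron_vec(apply(A1, u), apply(B1, v)))
```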


We are now ready to define tensor products of representations. There are
two different approaches to this, both of which are important. The first
approach starts with a representation of a group G acting on a space U and
a representation of another group H acting on a space V and produces a
representation of the product group $G \times H$ acting on the space $U \otimes V$. The
second approach starts with two different representations of the same group
G, acting on spaces U and V, and produces a representation of G acting on
$U \otimes V$. Both of these approaches can be adapted to apply to Lie algebras.


Definition 4.18. Let G and H be matrix Lie groups. Let $\Pi_1$ be a
representation of G acting on a space U and let $\Pi_2$ be a representation of H acting
on a space V. Then, the tensor product of $\Pi_1$ and $\Pi_2$ is a representation
$\Pi_1 \otimes \Pi_2$ of $G \times H$ acting on $U \otimes V$ defined by


\[ \Pi_1 \otimes \Pi_2(A, B) = \Pi_1(A) \otimes \Pi_2(B) \]

for all $A \in G$ and $B \in H$.


Using the above proposition, it is easy to check that, indeed, $\Pi_1 \otimes \Pi_2$ is a
representation of $G \times H$.

Now, if G and H are matrix Lie groups (i.e., G is a closed subgroup
of $GL(n; \mathbb{C})$ and H is a closed subgroup of $GL(m; \mathbb{C})$), then $G \times H$ can be
regarded in an obvious way as a closed subgroup of $GL(n + m; \mathbb{C})$. Thus, the
direct product of matrix Lie groups can be regarded as a matrix Lie group.
It is easy to check that the Lie algebra of $G \times H$ is isomorphic to the direct
sum of the Lie algebra of G and the Lie algebra of H.

In light of Proposition 4.4, the representation $\Pi_1 \otimes \Pi_2$ of $G \times H$ gives rise
to a representation of the Lie algebra of $G \times H$, namely $\mathfrak{g} \oplus \mathfrak{h}$. The following
proposition shows that this representation of $\mathfrak{g} \oplus \mathfrak{h}$ is not what one might
expect at first.


Proposition 4.19. Let G and H be matrix Lie groups, let $\Pi_1$ and $\Pi_2$ be
representations of G and H, respectively, and consider the representation
$\Pi_1 \otimes \Pi_2$ of $G \times H$. Let $\pi_1 \otimes \pi_2$ denote the associated representation of the
Lie algebra of $G \times H$, namely $\mathfrak{g} \oplus \mathfrak{h}$. Then, for all $X \in \mathfrak{g}$ and $Y \in \mathfrak{h}$,


\[ \pi_1 \otimes \pi_2(X, Y) = \pi_1(X) \otimes I + I \otimes \pi_2(Y). \]


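Proof. Suppose that u(t) is a smooth curve in U and v(t) is a smooth curve
in V. Then, we verify the product rule in the usual way:


\[
\begin{aligned}
\lim_{h \to 0} \frac{u(t+h) \otimes v(t+h) - u(t) \otimes v(t)}{h}
&= \lim_{h \to 0} \frac{u(t+h) \otimes v(t+h) - u(t+h) \otimes v(t)}{h}
 + \lim_{h \to 0} \frac{u(t+h) \otimes v(t) - u(t) \otimes v(t)}{h} \\
&= \lim_{h \to 0} \left[ u(t+h) \otimes \frac{v(t+h) - v(t)}{h} \right]
 + \lim_{h \to 0} \left[ \frac{u(t+h) - u(t)}{h} \otimes v(t) \right].
\end{aligned}
\]


Thus,

\[ \frac{d}{dt}\bigl( u(t) \otimes v(t) \bigr) = \frac{du}{dt} \otimes v(t) + u(t) \otimes \frac{dv}{dt}. \]


This being the case, we can compute $\pi_1 \otimes \pi_2(X, Y)$:


\[
\begin{aligned}
\pi_1 \otimes \pi_2(X, Y)(u \otimes v)
&= \left.\frac{d}{dt}\right|_{t=0} \Pi_1 \otimes \Pi_2(e^{tX}, e^{tY})(u \otimes v) \\
&= \left.\frac{d}{dt}\right|_{t=0} \Pi_1(e^{tX})u \otimes \Pi_2(e^{tY})v \\
&= \left( \left.\frac{d}{dt}\right|_{t=0} \Pi_1(e^{tX})u \right) \otimes v
 + u \otimes \left( \left.\frac{d}{dt}\right|_{t=0} \Pi_2(e^{tY})v \right).
\end{aligned}
\]


This shows that $\pi_1 \otimes \pi_2(X, Y) = \pi_1(X) \otimes I + I \otimes \pi_2(Y)$ on elements of the
form $u \otimes v$ and, therefore, on the whole space $U \otimes V$. □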


Definition 4.20. Let $\mathfrak{g}$ and $\mathfrak{h}$ be Lie algebras and let $\pi_1$ and $\pi_2$ be
representations of $\mathfrak{g}$ and $\mathfrak{h}$, acting on spaces U and V. Then, the tensor product of
$\pi_1$ and $\pi_2$, denoted $\pi_1 \otimes \pi_2$, is a representation of $\mathfrak{g} \oplus \mathfrak{h}$ acting on $U \otimes V$,
given by

\[ \pi_1 \otimes \pi_2(X, Y) = \pi_1(X) \otimes I + I \otimes \pi_2(Y) \]

for all $X \in \mathfrak{g}$ and $Y \in \mathfrak{h}$.

It is easy to check that this indeed defines a representation of $\mathfrak{g} \oplus \mathfrak{h}$. Note
that if we defined $\pi_1 \otimes \pi_2(X, Y) = \pi_1(X) \otimes \pi_2(Y)$, this would not be a
representation of $\mathfrak{g} \oplus \mathfrak{h}$, for this is not even a linear map (e.g., we would then
have $\pi_1 \otimes \pi_2(2X, 2Y) = 4\,\pi_1 \otimes \pi_2(X, Y)$). Note also that the above definition
applies even if $\pi_1$ and $\pi_2$ do not come from a representation of any matrix Lie
group.
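
The two claims in the preceding paragraph can be tested numerically. In the Python sketch below we assume, purely for illustration, that $\pi_1$ and $\pi_2$ are the standard representations of 2 × 2 matrix Lie algebras, so that the matrices X and Y themselves stand in for $\pi_1(X)$ and $\pi_2(Y)$; the function name `rho` is ours. The sketch checks that $X \otimes I + I \otimes Y$ respects the bracket of $\mathfrak{g} \oplus \mathfrak{h}$, whereas the naive formula $X \otimes Y$ scales by 4 rather than 2 when both arguments are doubled.

```python
def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def add(A, B):
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def scale(c, A):
    return [[c * a for a in row] for row in A]

def bracket(A, B):
    return add(matmul(A, B), scale(-1, matmul(B, A)))

def kron(A, B):
    return [[a * b for a in row_a for b in row_b] for row_a in A for row_b in B]

def identity(n):
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]

def rho(X, Y):
    """X (x) I + I (x) Y, the formula of Definition 4.20."""
    return add(kron(X, identity(len(Y))), kron(identity(len(X)), Y))

X1, X2 = [[0, 1], [0, 0]], [[1, 0], [0, -1]]
Y1, Y2 = [[0, 0], [1, 0]], [[2, 1], [0, 3]]

# The bracket in the direct sum is taken componentwise.
bracket_ok = (bracket(rho(X1, Y1), rho(X2, Y2))
              == rho(bracket(X1, X2), bracket(Y1, Y2)))

# The naive candidate X (x) Y is not linear in (X, Y).
naive_doubled = kron(scale(2, X1), scale(2, Y1))
naive_is_linear = (naive_doubled == scale(2, kron(X1, Y1)))
naive_quadruples = (naive_doubled == scale(4, kron(X1, Y1)))
```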


Definition 4.21. Let G be a matrix Lie group and let $\Pi_1$ and $\Pi_2$ be
representations of G, acting on spaces $V_1$ and $V_2$. Then, the tensor product of
$\Pi_1$ and $\Pi_2$ is a representation of G acting on $V_1 \otimes V_2$ defined by


\[ \Pi_1 \otimes \Pi_2(A) = \Pi_1(A) \otimes \Pi_2(A) \]

for all $A \in G$.


Proposition 4.22. With the above notation, the associated representation of
the Lie algebra $\mathfrak{g}$ satisfies


\[ \pi_1 \otimes \pi_2(X) = \pi_1(X) \otimes I + I \otimes \pi_2(X) \]


for all $X \in \mathfrak{g}$.


Proof. Using the product rule,


\[
\begin{aligned}
\pi_1 \otimes \pi_2(X)(u \otimes v)
&= \left.\frac{d}{dt}\right|_{t=0} \Pi_1(e^{tX})u \otimes \Pi_2(e^{tX})v \\
&= \pi_1(X)u \otimes v + u \otimes \pi_2(X)v.
\end{aligned}
\]


This is what we wanted to show. □


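Definition 4.23. If $\mathfrak{g}$ is a Lie algebra and $\pi_1$ and $\pi_2$ are representations of
$\mathfrak{g}$ acting on spaces $V_1$ and $V_2$, then the tensor product of $\pi_1$ and $\pi_2$ is a
representation of $\mathfrak{g}$ acting on the space $V_1 \otimes V_2$ defined by


\[ \pi_1 \otimes \pi_2(X) = \pi_1(X) \otimes I + I \otimes \pi_2(X) \]

for all $X \in \mathfrak{g}$.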


It is easy to check that $\Pi_1 \otimes \Pi_2$ and $\pi_1 \otimes \pi_2$ are actually representations of
G and $\mathfrak{g}$, respectively. There is some ambiguity in the notation, say, $\Pi_1 \otimes \Pi_2$.
After all, even if $\Pi_1$ and $\Pi_2$ are both representations of the same group G, we
could still regard $\Pi_1 \otimes \Pi_2$ as a representation of $G \times G$, by taking H = G in
Definition 4.18. We will rely on context to make clear whether we are thinking
of $\Pi_1 \otimes \Pi_2$ as a representation of $G \times G$ or as a representation of G.

Suppose $\Pi_1$ and $\Pi_2$ are irreducible representations of a group G. If we
regard $\Pi_1 \otimes \Pi_2$ as a representation of G, it may no longer be irreducible. If it is
not irreducible, one can attempt to decompose it as a direct sum of irreducible
representations. This process is called the Clebsch–Gordan theory. In the
case of SU(2), this theory is relatively simple. (In the physics literature, the
problem of analyzing tensor products of representations of SU(2) is called
“addition of angular momentum.”) See Exercise 11 and Appendix D.
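
To get a feel for what happens in the simplest case, consider the two-dimensional representation $V_1$ of sl(2; C), on which H acts with eigenvalues 1 and −1. By Definition 4.23, H acts on $V_1 \otimes V_1$ by $H \otimes I + I \otimes H$. The short Python sketch below (an illustration only, with H taken to be the diagonal matrix with entries 1 and −1) computes this operator and reads off its eigenvalues.

```python
def kron(A, B):
    return [[a * b for a in row_a for b in row_b] for row_a in A for row_b in B]

def add(A, B):
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

I2 = [[1, 0], [0, 1]]
H = [[1, 0], [0, -1]]  # action of H on V_1 in a basis of eigenvectors

# Action of H on V_1 (x) V_1 according to Definition 4.23.
H_on_tensor = add(kron(H, I2), kron(I2, H))

# The result is diagonal, so its eigenvalues are its diagonal entries.
is_diagonal = all(H_on_tensor[i][j] == 0
                  for i in range(4) for j in range(4) if i != j)
weights = [H_on_tensor[i][i] for i in range(4)]
```

The eigenvalues come out as 2, 0, 0, −2. The values 2, 0, −2 are those of H on a three-dimensional irreducible representation and the remaining 0 is that of H on a one-dimensional one, which is consistent with a splitting of $V_1 \otimes V_1$ into pieces of dimensions 3 and 1.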


4.7 Dual Representations 


Suppose that $\pi$ is a representation of a Lie algebra $\mathfrak{g}$ acting on a
finite-dimensional vector space V. Let $V^*$ denote the dual space of V, that is, the
space of linear functionals on V. (See Section B.7.) If A is a linear operator
on V, let $A^{tr}$ denote the dual or transpose operator on $V^*$,


\[ (A^{tr}\phi)(v) = \phi(Av) \]


for $\phi \in V^*$, $v \in V$. If $v_1, \ldots, v_n$ is a basis for V, then there is a naturally
associated “dual basis” $\phi_1, \ldots, \phi_n$ with the property that $\phi_k(v_l) = \delta_{kl}$. Then,
the matrix for $A^{tr}$ in the dual basis is simply the transpose (in the usual
matrix sense) of the matrix of A in the original basis. Note that the matrix
of $A^{tr}$ is the transpose of the matrix of A and not the conjugate transpose. If
A and B are linear operators on V, then


\[ (AB)^{tr} = B^{tr} A^{tr}. \]


Definition 4.24. Suppose G is a matrix Lie group and $\Pi$ is a representation
of G acting on a finite-dimensional vector space V. Then, the dual
representation $\Pi^*$ to $\Pi$ is the representation of G acting on $V^*$ given by


\[ \Pi^*(g) = [\Pi(g^{-1})]^{tr}. \]


Similarly, if $\pi$ is a representation of a Lie algebra $\mathfrak{g}$ acting on a
finite-dimensional vector space V, then $\pi^*$ is the representation of $\mathfrak{g}$ acting on $V^*$
given by

\[ \pi^*(X) = -\pi(X)^{tr}. \]


Note that since the transpose is an order-reversing operation, we cannot
simply define $\Pi^*(g) = \Pi(g)^{tr}$. This would not be a representation; we need
the inverse in $\Pi^*$ and the minus sign in $\pi^*$ in order for the dual
representations to actually be representations. The dual representation is also called
the contragredient representation.
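
The need for the inverse can be seen on a small example. In the Python sketch below we assume, for illustration only, that $\Pi$ is the standard representation of 2 × 2 integer matrices of determinant 1, so that $\Pi(g) = g$ and inverses can be computed exactly. The names `dual` and `inverse_det1` are ours.

```python
def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def transpose(A):
    return [[A[j][i] for j in range(2)] for i in range(2)]

def inverse_det1(A):
    """Inverse of a 2x2 matrix of determinant 1."""
    (a, b), (c, d) = A
    assert a * d - b * c == 1
    return [[d, -b], [-c, a]]

def dual(g):
    """Dual representation: transpose of Pi(g^{-1}), with Pi(g) = g."""
    return transpose(inverse_det1(g))

g = [[1, 1], [0, 1]]
h = [[1, 0], [1, 1]]

# With the inverse, products are respected.
dual_is_homomorphism = (dual(matmul(g, h)) == matmul(dual(g), dual(h)))

# Without the inverse, the order of the factors is reversed.
transpose_is_homomorphism = (transpose(matmul(g, h))
                             == matmul(transpose(g), transpose(h)))
transpose_reverses_order = (transpose(matmul(g, h))
                            == matmul(transpose(h), transpose(g)))
```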

The main properties of dual representations are summarized in the follow- 
ing elementary proposition, whose proof is left as an exercise to the reader 
(Exercise 7). 


Proposition 4.25. If $\Pi$ is a representation of a matrix Lie group G, then (1)
$\Pi^*$ is irreducible if and only if $\Pi$ is irreducible and (2) $(\Pi^*)^*$ is isomorphic to
$\Pi$. Similar statements apply to Lie algebra representations.


4.8 Schur’s Lemma 


Let $\Pi$ and $\Sigma$ be representations of a matrix Lie group G, acting on spaces V
and W. Recall that an intertwining map of representations is a linear map
$\phi : V \to W$ with the property that


\[ \phi(\Pi(A)v) = \Sigma(A)\phi(v) \]


for all $v \in V$ and all $A \in G$. Schur’s Lemma is an extremely important result
which tells us about intertwining maps of irreducible representations. Part of
Schur’s Lemma applies to both real and complex representations, but part of
it applies only to complex representations.

It is desirable to be able to state Schur’s Lemma simultaneously for groups
and Lie algebras. In order to do so, we need to indulge in a common abuse of
notation. If, say, $\Pi$ is a representation of G acting on a space V, we will refer
to V as the representation, without explicit reference to $\Pi$.


Theorem 4.26 (Schur’s Lemma).


1. Let V and W be irreducible real or complex representations of a group
or Lie algebra and let $\phi : V \to W$ be an intertwining map. Then, either
$\phi = 0$ or $\phi$ is an isomorphism.

2. Let V be an irreducible complex representation of a group or Lie algebra
and let $\phi : V \to V$ be an intertwining map of V with itself. Then, $\phi = \lambda I$,
for some $\lambda \in \mathbb{C}$.

3. Let V and W be irreducible complex representations of a group or Lie
algebra and let $\phi_1, \phi_2 : V \to W$ be nonzero intertwining maps. Then,
$\phi_1 = \lambda \phi_2$, for some $\lambda \in \mathbb{C}$.


Before proving Schur’s Lemma, we obtain two corollaries of it. 


Corollary 4.27. Let $\Pi$ be an irreducible complex representation of a matrix
Lie group G. If A is in the center of G, then $\Pi(A) = \lambda I$. Similarly, if $\pi$ is an
irreducible complex representation of a Lie algebra $\mathfrak{g}$ and if X is in the center
of $\mathfrak{g}$ (i.e., $[X, Y] = 0$ for all $Y \in \mathfrak{g}$), then $\pi(X) = \lambda I$.


Proof. We prove the group case; the proof of the Lie algebra case is similar.
If A is in the center of G, then for all $B \in G$,


\[ \Pi(A)\Pi(B) = \Pi(AB) = \Pi(BA) = \Pi(B)\Pi(A). \]


However, this says exactly that $\Pi(A)$ is an intertwining map of the space with
itself. So by Point 2 of Schur’s Lemma, $\Pi(A)$ is a multiple of the identity. □


Corollary 4.28. An irreducible complex representation of a commutative 
group or Lie algebra is one dimensional. 


Proof. Again, we prove only the group case. If G is commutative, then the 
center of G is all of G, so by the previous corollary II(A) is a multiple of the 
identity for each A € G. However, this means that every subspace of V is 
invariant! Thus, the only way that V can fail to have a nontrivial invariant 
subspace is for it not to have any nontrivial subspaces. This means that V must 
be one dimensional. (Recall that we do not allow V to be zero dimensional.) □


We now provide the proof of Schur’s Lemma. 


Proof. As usual, we will prove just the group case; the proof of the Lie algebra 
case requires only the obvious notational changes. 

Proof of Point 1. Saying that $\phi$ is an intertwining map means $\phi(\Pi(A)v) =
\Sigma(A)(\phi(v))$ for all $v \in V$ and all $A \in G$. Now, suppose that $v \in \ker(\phi)$. Then,


\[ \phi(\Pi(A)v) = \Sigma(A)\phi(v) = 0. \]


This shows that $\ker \phi$ is an invariant subspace of V. Since V is irreducible, we
must have $\ker \phi = 0$ or $\ker \phi = V$. Thus, $\phi$ is either one-to-one or zero.

Suppose $\phi$ is one-to-one. Then, the image of $\phi$ is a nonzero subspace of W.
On the other hand, the image of $\phi$ is invariant, for if $w \in W$ is of the form
$\phi(v)$ for some $v \in V$, then

\[ \Sigma(A)w = \Sigma(A)\phi(v) = \phi(\Pi(A)v). \]


Since W is irreducible and the image of $\phi$ is nonzero and invariant, we must have
$\operatorname{image}(\phi) = W$. Thus, $\phi$ is either zero or one-to-one and onto.

Proof of Point 2. Suppose now that V is an irreducible complex
representation and that $\phi : V \to V$ is an intertwining map of V to itself. This
means that $\phi\,\Pi(A) = \Pi(A)\,\phi$ for all $A \in G$ (i.e., that $\phi$ commutes with all of
the $\Pi(A)$’s). Now, since we are working over an algebraically closed field, $\phi$
must have at least one eigenvalue $\lambda \in \mathbb{C}$. Let U denote the eigenspace for $\phi$
associated to the eigenvalue $\lambda$ and let $u \in U$. Then, for each $A \in G$,


\[ \phi(\Pi(A)u) = \Pi(A)\phi(u) = \lambda\,\Pi(A)u. \]


Thus, applying $\Pi(A)$ to an element of the $\lambda$-eigenspace of $\phi$ yields another
element of the $\lambda$-eigenspace. Thus, U is invariant.

Since $\lambda$ is an eigenvalue, $U \neq 0$, and so, by the irreducibility of V, we must
have U = V. This means that $\phi(v) = \lambda v$ for all $v \in V$ (i.e., that $\phi = \lambda I$).

Proof of Point 3. If $\phi_2 \neq 0$, then by Point 1, $\phi_2$ is an isomorphism. Now,
look at $\phi_1 \circ \phi_2^{-1}$. As is easily checked, the composition of two intertwining
maps is an intertwining map, so $\phi_1 \circ \phi_2^{-1}$ is an intertwining map of W with
itself. Thus, by Point 2, $\phi_1 \circ \phi_2^{-1} = \lambda I$, whence $\phi_1 = \lambda \phi_2$. □


4.9 Group Versus Lie Algebra Representations 


We know from Chapter 2 (Theorem 2.21) that every Lie group homomorphism 
gives rise to a Lie algebra homomorphism. In particular, this shows that every 
representation of a matrix Lie group gives rise to a representation of the 
associated Lie algebra. In the case of a simply-connected matrix Lie group G, 
we have the converse: A Lie algebra homomorphism gives rise to a Lie group 
homomorphism (Theorem 3.7). This means, in particular, that for a simply- 
connected matrix Lie group G, there is a natural one-to-one correspondence 
between the representations of G and the representations of the Lie algebra 
g. For non-simply-connected groups, there may be Lie algebra representations 
for which there is no associated Lie group representation. 

It is instructive to see how this general theory works out in the case of
SU(2) (which is simply connected) and SO(3) (which is not). We have shown
(Theorem 4.9) that every irreducible complex representation of su(2) is
equivalent to one of the representations $\pi_m$ described in Section 4.3. (Recall that the
irreducible complex representations of su(2) are in one-to-one correspondence
with the irreducible representations of sl(2; C).) Each of the representations
$\pi_m$ of su(2) was constructed from the corresponding representation $\Pi_m$ of the
group SU(2). Thus, we see, by brute-force computation, that every irreducible
complex representation of su(2) actually comes from a representation of the
group SU(2)! This is consistent with the fact that SU(2) is simply connected
(Proposition 1.14).
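

Let us now consider the situation for SO(3), which is not simply connected.
(See Section 1.5 and Appendix E.) We know from Exercise 16 of Chapter 2
that the Lie algebras su(2) and so(3) are isomorphic. In particular, if we take
the basis

\[
E_1 = \frac{1}{2}\begin{pmatrix} i & 0 \\ 0 & -i \end{pmatrix}, \quad
E_2 = \frac{1}{2}\begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}, \quad
E_3 = \frac{1}{2}\begin{pmatrix} 0 & i \\ i & 0 \end{pmatrix}
\]


for su(2) and the basis


\[
F_1 = \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & -1 \\ 0 & 1 & 0 \end{pmatrix}, \quad
F_2 = \begin{pmatrix} 0 & 0 & 1 \\ 0 & 0 & 0 \\ -1 & 0 & 0 \end{pmatrix}, \quad
F_3 = \begin{pmatrix} 0 & -1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}
\]


for so(3), then direct computation shows that $[E_1, E_2] = E_3$, $[E_2, E_3] = E_1$,
and $[E_3, E_1] = E_2$, and similarly with the E’s replaced by the F’s. Thus,
the linear map $\phi : \mathrm{su}(2) \to \mathrm{so}(3)$ which takes $E_i$ to $F_i$ will be a Lie algebra
isomorphism.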
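
The “direct computation” is routine but easy to get wrong by a sign, so it may be reassuring to let a machine do it. The following Python sketch (helper names are ours) multiplies out the commutators for both bases. All entries are multiples of 1/4, so the floating-point arithmetic here is exact.

```python
def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def bracket(A, B):
    AB, BA = matmul(A, B), matmul(B, A)
    return [[x - y for x, y in zip(ra, rb)] for ra, rb in zip(AB, BA)]

# Basis of su(2)
E1 = [[0.5j, 0], [0, -0.5j]]
E2 = [[0, 0.5], [-0.5, 0]]
E3 = [[0, 0.5j], [0.5j, 0]]

# Basis of so(3)
F1 = [[0, 0, 0], [0, 0, -1], [0, 1, 0]]
F2 = [[0, 0, 1], [0, 0, 0], [-1, 0, 0]]
F3 = [[0, -1, 0], [1, 0, 0], [0, 0, 0]]

def cyclic_relations_hold(X1, X2, X3):
    """Check [X1, X2] = X3, [X2, X3] = X1, [X3, X1] = X2."""
    return (bracket(X1, X2) == X3
            and bracket(X2, X3) == X1
            and bracket(X3, X1) == X2)

su2_ok = cyclic_relations_hold(E1, E2, E3)
so3_ok = cyclic_relations_hold(F1, F2, F3)
```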

Since su(2) and so(3) are isomorphic Lie algebras, they must have “the
same” representations. Specifically, if $\pi$ is a representation of su(2), then
$\pi \circ \phi^{-1}$ will be a representation of so(3), and every representation of so(3) is of this
form. In particular, the irreducible representations of so(3) are precisely of the
form $\sigma_m = \pi_m \circ \phi^{-1}$. We wish to determine, for a particular m, whether there is
a representation $\Sigma_m$ of the group SO(3) such that $\Sigma_m(\exp X) = \exp(\sigma_m(X))$
for all X in so(3).


Proposition 4.29. Let $\sigma_m = \pi_m \circ \phi^{-1}$ be the irreducible complex
representations of the Lie algebra so(3) ($m \ge 0$). If m is even, then there is a
representation $\Sigma_m$ of the group SO(3) such that $\Sigma_m(\exp X) = \exp(\sigma_m(X))$ for all X
in so(3). If m is odd, then there is no such representation of SO(3).


Note that the condition that m be even is equivalent to the condition that
$\dim V_m = m + 1$ be odd. Thus, it is the odd-dimensional representations of
the Lie algebra so(3) which come from group representations.

In the physics literature, the representations of su(2) ≅ so(3) are labeled
by the parameter $l = m/2$. In terms of this notation, a representation of
so(3) comes from a representation of SO(3) if and only if l is an integer. The
representations with l an integer are called “integer spin”; the others are called
“half-integer spin.” If one attempts to construct $\Sigma_m$ by the construction in
the proof of Theorem 3.7, then one finds that not all paths are homotopic
and the value of the would-be homomorphism $\Sigma_m$ can depend on the path.
Consider, for example, the path in SO(3) consisting of rotations by angle $2\pi t$
in the (x, y)-plane, which comes back to the identity when t = 1. It can be
shown that this path is not homotopic to the constant path. If one defines
$\Sigma_m$ along the constant path, then one gets the value $\Sigma_m(I) = I$, as expected.
If m is odd, however, and one defines $\Sigma_m$ along the path of rotations in the
(x, y)-plane, then one gets the value $\Sigma_m(I) = -I$. This strongly suggests
(and Proposition 4.29 confirms) that there is no way to define $\Sigma_m$ (m odd)
as a “single-valued” representation of SO(3). An electron, for example, is a
“spin $\tfrac{1}{2}$” particle, which means that it is described in quantum mechanics in
a way that involves the representation $\sigma_1$ of so(3). In the quantum mechanics
literature, one finds statements to the effect that performing a 360° rotation
on the wave function of the electron gives back the negative of the original
wave function. This statement reflects that if one attempts to construct the
nonexistent representation $\Sigma_1$ of SO(3), then when defining $\Sigma_1$ along a path
of rotations in some plane, one gets that $\Sigma_1(I) = -I$.


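Proof. Case 1: m odd. In this case, we want to prove that there is no
representation $\Sigma_m$ such that $\Sigma_m(\exp X) = \exp(\sigma_m(X))$ for all X in so(3). Suppose,
to the contrary, that there is such a $\Sigma_m$. Then, take $X = 2\pi F_1$. Computing
as in Section 2.2, we see that


\[
e^{2\pi F_1} = \begin{pmatrix} 1 & 0 & 0 \\ 0 & \cos 2\pi & -\sin 2\pi \\ 0 & \sin 2\pi & \cos 2\pi \end{pmatrix} = I.
\]


Thus, on the one hand, $\Sigma_m(e^{2\pi F_1}) = \Sigma_m(I) = I$, whereas, on the other hand,
$\Sigma_m(e^{2\pi F_1}) = e^{2\pi \sigma_m(F_1)}$.

Let us compute $e^{2\pi \sigma_m(F_1)}$. By definition, $\sigma_m(F_1) = \pi_m(\phi^{-1}(F_1)) =
\pi_m(E_1)$. However, $E_1 = \tfrac{i}{2} H$, where, as usual,


\[ H = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}. \]


We know that there is a basis $u_0, u_1, \ldots, u_m$ for $V_m$ such that $u_k$ is an
eigenvector for $\pi_m(H)$ with eigenvalue $m - 2k$. This means that $u_k$ is also an
eigenvector for $\sigma_m(F_1) = \tfrac{i}{2}\pi_m(H)$, with eigenvalue $\tfrac{i}{2}(m - 2k)$. Thus, in the
basis $\{u_k\}$, we have


\[
\sigma_m(F_1) = \begin{pmatrix}
\tfrac{i}{2} m & & & \\
 & \tfrac{i}{2}(m-2) & & \\
 & & \ddots & \\
 & & & \tfrac{i}{2}(-m)
\end{pmatrix}.
\]

But we are assuming that m is odd! This means that $m - 2k$ is an odd
integer. Thus, $e^{2\pi \frac{i}{2}(m-2k)} = -1$, and in the basis $\{u_k\}$

\[
e^{2\pi \sigma_m(F_1)} = \begin{pmatrix}
e^{2\pi \frac{i}{2} m} & & & \\
 & e^{2\pi \frac{i}{2}(m-2)} & & \\
 & & \ddots & \\
 & & & e^{2\pi \frac{i}{2}(-m)}
\end{pmatrix} = -I.
\]


Thus, on the one hand, $\Sigma_m(e^{2\pi F_1}) = \Sigma_m(I) = I$, whereas, on the other hand,
$\Sigma_m(e^{2\pi F_1}) = e^{2\pi \sigma_m(F_1)} = -I$. This is a contradiction, so there can be no such
group representation $\Sigma_m$.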

Case 2: m is even. We will use the following: 


Lemma 4.30. There exists a Lie group homomorphism $\Phi : SU(2) \to SO(3)$
such that

(1) $\Phi$ maps SU(2) onto SO(3),

(2) $\ker \Phi = \{I, -I\}$, and

(3) the associated Lie algebra homomorphism is the map $\phi : \mathrm{su}(2) \to \mathrm{so}(3)$
described earlier, namely the one satisfying $\phi(E_i) = F_i$ (i = 1, 2, 3).


Proof. Exercise 12. □


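Now, consider the representations $\Pi_m$ of SU(2). I claim that if m is even,
then $\Pi_m(-I) = I$. To see this, note that


\[
e^{2\pi E_1} = \exp\begin{pmatrix} \pi i & 0 \\ 0 & -\pi i \end{pmatrix}
= \begin{pmatrix} e^{\pi i} & 0 \\ 0 & e^{-\pi i} \end{pmatrix} = -I.
\]


Thus, $\Pi_m(-I) = \Pi_m(e^{2\pi E_1}) = e^{\pi_m(2\pi E_1)}$. However, as in Case 1,


\[
e^{\pi_m(2\pi E_1)} = \begin{pmatrix}
e^{2\pi \frac{i}{2} m} & & & \\
 & e^{2\pi \frac{i}{2}(m-2)} & & \\
 & & \ddots & \\
 & & & e^{2\pi \frac{i}{2}(-m)}
\end{pmatrix}.
\]


Only, this time, m is even, and so $\tfrac{1}{2}(m - 2k)$ is an integer, so that $\Pi_m(-I) =
e^{\pi_m(2\pi E_1)} = I$.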

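Since $\Pi_m(-I) = I$, we have $\Pi_m(-U) = \Pi_m(U)$ for all $U \in SU(2)$. According to
Lemma 4.30, for each $R \in SO(3)$, there is a unique pair of elements $\{U, -U\}$
such that $\Phi(U) = \Phi(-U) = R$. Since $\Pi_m(U) = \Pi_m(-U)$, it makes sense to
define

\[ \Sigma_m(R) = \Pi_m(U). \]


It is easy to see that $\Sigma_m$ is a Lie group homomorphism (hence, a
representation). By construction, we have


\[ \Pi_m = \Sigma_m \circ \Phi. \tag{4.12} \]


Now, if $\tilde{\sigma}_m$ denotes the Lie algebra representation associated to $\Sigma_m$, then
it follows from (4.12) that


\[ \pi_m = \tilde{\sigma}_m \circ \tilde{\phi}, \]

where $\tilde{\phi}$ is the Lie algebra homomorphism associated to $\Phi$. However, by
Lemma 4.30, $\tilde{\phi}$ takes $E_i$ to $F_i$ (i.e., $\tilde{\phi} = \phi$), so $\pi_m = \tilde{\sigma}_m \circ \phi$,
or $\tilde{\sigma}_m = \pi_m \circ \phi^{-1} = \sigma_m$. Thus, $\Sigma_m$ is the desired representation of SO(3). □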
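
Both cases of the proof turn on the same scalar computation: the value of $e^{2\pi \frac{i}{2}(m-2k)}$ for $k = 0, \ldots, m$. The Python sketch below evaluates these numbers for a few values of m. The function names and the tolerance are our own choices, and the comparison is made up to floating-point rounding.

```python
import cmath

def loop_endpoint_eigenvalues(m):
    """Diagonal entries of exp(2*pi*sigma_m(F_1)) in the basis u_0, ..., u_m."""
    return [cmath.exp(2 * cmath.pi * 0.5j * (m - 2 * k)) for k in range(m + 1)]

def all_close_to(values, target, tol=1e-9):
    return all(abs(z - target) < tol for z in values)

# m odd: every diagonal entry is -1, so the matrix is -I.
odd_gives_minus_identity = all(
    all_close_to(loop_endpoint_eigenvalues(m), -1) for m in (1, 3, 5))

# m even: every diagonal entry is +1, so the matrix is I.
even_gives_identity = all(
    all_close_to(loop_endpoint_eigenvalues(m), 1) for m in (0, 2, 4))
```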


4.10 Complete Reducibility 


Definition 4.31. A finite-dimensional representation of a group or Lie alge- 
bra is said to be completely reducible if it is isomorphic to a direct sum of 
a finite number of irreducible representations. 


Definition 4.32. A group or Lie algebra is said to have the complete re- 
ducibility property if every finite-dimensional representation of it is com- 
pletely reducible. 


As we will see in Chapter 6, the complete reducibility property is a very 
special one that most groups and Lie algebras do not have. If a group or Lie 
algebra does have the complete reducibility property, then the study of its 
representations reduces to the study of its irreducible representations, which 
simplifies the analysis considerably. 


Proposition 4.33. If V is a completely reducible representation of a group 
or Lie algebra, then the following properties hold. 


1. Every invariant subspace of V is completely reducible. 
2. Given any invariant subspace U of V, there is another invariant subspace
$\tilde{U}$ such that V is the direct sum of U and $\tilde{U}$.


The proof of this result is tedious but elementary, requiring only Schur’s 
Lemma and basic linear algebra. Exercises 13 through 18 guide the reader 
through the proof. We will prove later that every finite or compact group has 
the complete reducibility property. The proof shows directly (i.e., without ap- 
pealing to Proposition 4.33) that representations of finite and compact groups 
have Properties 1 and 2 of the proposition, in addition to being completely 
reducible. 


Proposition 4.34. Let G be a matrix Lie group. Let $\Pi$ be a finite-dimensional
unitary representation of G, acting on a finite-dimensional real or complex
Hilbert space V. Then, $\Pi$ is completely reducible.


Proof. Let V denote the (finite-dimensional!) Hilbert space on which $\Pi$ acts
and let $\langle \cdot, \cdot \rangle$ denote the inner product on V. Now, let $W \subset V$ be an invariant
subspace. Let $W^\perp$ be the orthogonal complement of W; that is, $W^\perp$ is the
space of all vectors v in V such that $\langle v, w \rangle = 0$ for all w in W. Then, V is the
direct sum of W and $W^\perp$.

I claim that $W^\perp$ is also an invariant subspace. To see this, note that since
$\Pi$ is unitary, $\Pi(A)^* = \Pi(A)^{-1} = \Pi(A^{-1})$ for all $A \in G$. Then, for any $w \in W$
and any $v \in W^\perp$, we have


\[ \langle \Pi(A)v, w \rangle = \langle v, \Pi(A)^* w \rangle = \langle v, \Pi(A^{-1})w \rangle = \langle v, w' \rangle = 0. \]


In the last step, we have used that $w' = \Pi(A^{-1})w$ is in W, since W is invariant.
This shows that $\Pi(A)v$ is orthogonal to every element of W (i.e., that
$\Pi(A)v \in W^\perp$).

We have established, then, that for unitary representations, the orthogonal
complement of an invariant subspace is, again, invariant. Suppose now
that V is not irreducible. Then, we can find an invariant subspace W that is
neither {0} nor V, and we decompose V as $W \oplus W^\perp$. Then, W and $W^\perp$ are
both invariant subspaces and, thus, unitary representations of G in their own
right. Then, W is either irreducible or it splits as an orthogonal direct sum of
invariant subspaces, and similarly for $W^\perp$. We continue this process, and since
V is finite dimensional, it cannot go on forever. Each time, the dimensions
of the spaces get smaller, so eventually we must get irreducible pieces (when
the dimension reaches one, if not sooner). Thus, we eventually succeed in
decomposing V as a direct sum of irreducible invariant subspaces. □


Proposition 4.35. Every finite group has the complete reducibility property. 


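Proof. Suppose that $\Pi$ is a representation of G, acting on a space V. Choose
an arbitrary inner product $\langle \cdot, \cdot \rangle$ on V. Then, define a new inner product $\langle \cdot, \cdot \rangle_G$
on V by

\[ \langle v_1, v_2 \rangle_G = \sum_{g \in G} \langle \Pi(g)v_1, \Pi(g)v_2 \rangle. \]


It is very easy to check that indeed $\langle \cdot, \cdot \rangle_G$ is an inner product. Furthermore,
if $h \in G$, then


\[
\begin{aligned}
\langle \Pi(h)v_1, \Pi(h)v_2 \rangle_G &= \sum_{g \in G} \langle \Pi(g)\Pi(h)v_1, \Pi(g)\Pi(h)v_2 \rangle \\
&= \sum_{g \in G} \langle \Pi(gh)v_1, \Pi(gh)v_2 \rangle.
\end{aligned}
\]


However, as g ranges over G, so does gh. Thus, in fact,

\[ \langle \Pi(h)v_1, \Pi(h)v_2 \rangle_G = \langle v_1, v_2 \rangle_G; \]


that is, $\Pi$ is a unitary representation with respect to the inner product $\langle \cdot, \cdot \rangle_G$.
Thus, $\Pi$ is isomorphic to a direct sum of irreducibles, by Proposition 4.34. □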
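
The averaging construction is easy to carry out by hand for a small group. In the Python sketch below we take, as an illustrative assumption, the cyclic group of order 3 acting on the real plane through an integer matrix A with $A^3 = I$ that is not orthogonal for the standard inner product. Writing the standard inner product as $v_1^T v_2$, the averaged inner product is $v_1^T M v_2$ with $M = \sum_g \Pi(g)^T \Pi(g)$, and invariance under h amounts to $\Pi(h)^T M\, \Pi(h) = M$.

```python
def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def transpose(A):
    return [list(col) for col in zip(*A)]

def add(A, B):
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

I2 = [[1, 0], [0, 1]]
A = [[0, -1], [1, -1]]          # satisfies A^3 = I
group = [I2, A, matmul(A, A)]   # image of the cyclic group of order 3
order_three = (matmul(A, matmul(A, A)) == I2)

# The standard inner product is NOT preserved by every group element.
standard_invariant = all(matmul(transpose(h), h) == I2 for h in group)

# Gram matrix of the averaged inner product: M = sum over g of g^T g.
M = [[0, 0], [0, 0]]
for g in group:
    M = add(M, matmul(transpose(g), g))

# The averaged inner product IS preserved: h^T M h = M for all h.
averaged_invariant = all(
    matmul(transpose(h), matmul(M, h)) == M for h in group)

# M is symmetric positive definite, so it does define an inner product.
positive_definite = (M[0][0] > 0 and M[0][0] * M[1][1] - M[0][1] * M[1][0] > 0)
```

In this example M works out to the matrix with rows (4, −2) and (−2, 4).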


There is a variant of the above argument which can be used to prove the 
following result: 


Proposition 4.36. If G is a compact matrix Lie group, G has the complete 
reducibility property. 


The argument below is sometimes called “Weyl’s unitarian trick.” 


Proof. This proof requires the notion of Haar measure. (See, for example, 
Chapter VIII of Knapp (1996) or Section C.4.) 

A left Haar measure on a matrix Lie group G is a nonzero measure $\mu$
on the Borel $\sigma$-algebra in G with the following two properties: (1) It is locally
finite (i.e., every point in G has a neighborhood with finite measure); (2) it
is left-translation invariant. Left-translation invariance means that $\mu(gE) =
\mu(E)$ for all $g \in G$ and for all Borel sets $E \subset G$, where

\[ gE = \{ ge \mid e \in E \}. \]


It is a fact, which we cannot prove here, that every matrix Lie group has a 
left Haar measure and that this measure is unique up to multiplication by 
a constant. (One can analogously define right Haar measure, and a similar 
theorem holds for it. Left Haar measure and right Haar measure may or may 
not coincide; a group for which they do is called unimodular.) 

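Now, the key fact for our purpose is that left Haar measure is finite if and
only if the group G is compact. Suppose, then, that $\Pi$ is a finite-dimensional
representation of a compact group G acting on a space V. Let $\langle \cdot, \cdot \rangle$ be an
arbitrary inner product on V and define a new inner product $\langle \cdot, \cdot \rangle_G$ on V by


\[ \langle v_1, v_2 \rangle_G = \int_G \langle \Pi(g)v_1, \Pi(g)v_2 \rangle \, d\mu(g), \]


where $\mu$ is a left Haar measure. Again, it is easy to check that $\langle \cdot, \cdot \rangle_G$ is an
inner product. Furthermore, if $h \in G$, then by the translation invariance of $\mu$
(a compact group is unimodular, so $\mu$ is also right-translation invariant),


\[
\begin{aligned}
\langle \Pi(h)v_1, \Pi(h)v_2 \rangle_G &= \int_G \langle \Pi(g)\Pi(h)v_1, \Pi(g)\Pi(h)v_2 \rangle \, d\mu(g) \\
&= \int_G \langle \Pi(gh)v_1, \Pi(gh)v_2 \rangle \, d\mu(g) \\
&= \langle v_1, v_2 \rangle_G.
\end{aligned}
\]


So, $\Pi$ is a unitary representation with respect to $\langle \cdot, \cdot \rangle_G$, and thus completely
reducible. Note that the integral defining $\langle \cdot, \cdot \rangle_G$ is convergent because $\mu$ is
finite. □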


4.11 Exercises 


1. Prove Point 2 of Proposition 4.5.

2. Suppose that $\Pi$ is a finite-dimensional unitary representation of a
matrix Lie group G (i.e., V is a finite-dimensional Hilbert space, and $\Pi$ is a
continuous homomorphism of G into U(V)). Let $\pi$ be the associated
representation of the Lie algebra $\mathfrak{g}$. Show that for each $X \in \mathfrak{g}$, $\pi(X)^* = -\pi(X)$.

3. Show that the adjoint representation and the standard representation are 
equivalent representations of the Lie algebra so(3). Show that the adjoint 
and standard representations of the group SO(3) are equivalent. 

4. Define a vector space with basis $u_0, u_1, \ldots, u_m$. Now, define operators
$\pi(H)$, $\pi(X)$, and $\pi(Y)$ by formula (4.10). Verify by direct computation
that the operators defined by (4.10) satisfy the commutation relations
$[\pi(H), \pi(X)] = 2\pi(X)$, $[\pi(H), \pi(Y)] = -2\pi(Y)$, and $[\pi(X), \pi(Y)] =
\pi(H)$. (Thus, $\pi(H)$, $\pi(X)$, and $\pi(Y)$ define a representation of sl(2; C).)
Hint: When dealing with $\pi(Y)$, treat the case of $u_k$, k < m, separately
from the case of $u_m$.

5. Consider the standard representation of the Heisenberg group, acting on
$\mathbb{C}^3$. Determine all subspaces of $\mathbb{C}^3$ which are invariant under the action of
the Heisenberg group. Is this representation completely reducible?


6. Give an example of a representation of the commutative group $\mathbb{R}$ which
is not completely reducible.


7. Prove Proposition 4.25.
Hint: There is a one-to-one correspondence between subspaces of V and
subspaces of $V^*$ as follows: Given a subspace W of V, the annihilator of
W is the subspace of all $\phi$ in $V^*$ such that $\phi$ is zero on W. See Section
B.7.


8. Consider the unitary representations $\Pi_\hbar$ of the real Heisenberg group.
Assume that there is some sort of associated representation $\pi_\hbar$ of the Lie
algebra, which should be given by


\[ \pi_\hbar(X) f = \left. \frac{d}{dt}\, \Pi_\hbar(e^{tX}) f \right|_{t=0}. \]


(We have not proved any theorem of this sort for infinite-dimensional
unitary representations.)

Computing in a purely formal manner (i.e., ignoring all technical issues),
compute


\[
\pi_\hbar \begin{pmatrix} 0 & 1 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}, \quad
\pi_\hbar \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 0 & 0 \end{pmatrix}, \quad
\pi_\hbar \begin{pmatrix} 0 & 0 & 1 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}.
\]


Verify (still formally) that these operators have the right commutation
relations to generate a representation of the Lie algebra of the real
Heisenberg group; that is, verify that on this basis, $\pi_\hbar([X, Y]) = [\pi_\hbar(X), \pi_\hbar(Y)]$.

Why is this computation not rigorous?


9. Consider the Heisenberg group over the field $\mathbb{Z}/p$ of integers mod $p$, with
$p$ prime, namely

\[
H_p = \left\{ \begin{pmatrix} 1 & a & b \\ 0 & 1 & c \\ 0 & 0 & 1 \end{pmatrix}
\,\middle|\, a, b, c \in \mathbb{Z}/p \right\}.
\]

This is a subgroup of the group GL(3; Z/p) and has $p^3$ elements.

Let $V_p$ denote the space of complex-valued functions on $\mathbb{Z}/p$, which is a
$p$-dimensional complex vector space. For each nonzero $n \in \mathbb{Z}/p$, define a
complex representation $\Pi_n$ of $H_p$ by the formula

\[ (\Pi_n f)(x) = e^{-i 2\pi n b/p}\, e^{i 2\pi n c x/p}\, f(x - a), \qquad x \in \mathbb{Z}/p. \]

(These representations are analogous to the unitary representations of
the real Heisenberg group, with the quantity $2\pi n/p$ playing the role of $\hbar$.)
Note that these representations are defined over $\mathbb{C}$ rather than $\mathbb{Z}/p$.

(a) Show that for each $n$, $\Pi_n$ is actually a representation of $H_p$ and that
it is irreducible.

(b) Determine (up to equivalence) all of the one-dimensional complex
representations of $H_p$.

(c) Show that every irreducible complex representation of $H_p$ is either one
dimensional or equivalent to one of the $\Pi_n$'s.

10. Prove Theorem 4.15.

Hints: For existence, choose bases $\{e_i\}$ and $\{f_j\}$ for $U$ and $V$. Then,
define a space $W$ which has as a basis $\{w_{ij} \mid 1 \le i \le n,\ 1 \le j \le m\}$. Define
$\phi(e_i, f_j) = w_{ij}$ and extend by bilinearity. For uniqueness, use the universal
property.

11. Recall the spaces $V_m$ introduced in Section 4.3, viewed as representations
of the Lie algebra sl(2;C). In particular, consider the space $V_1$ (which has
dimension 2).

(a) Regard $V_1 \otimes V_1$ as a representation of sl(2;C), as in Definition 4.23.
Show that this representation is not irreducible.

(b) Now, view $V_1 \otimes V_1$ as a representation of sl(2;C) $\oplus$ sl(2;C), as in
Definition 4.20. Show that this representation is irreducible.

12. Proof of Lemma 4.30.

Let $\{E_1, E_2, E_3\}$ be the usual basis for su(2) and let $\{F_1, F_2, F_3\}$ be the
basis for so(3) introduced in Section 4.9. Identify su(2) with $\mathbb{R}^3$ by
identifying the basis $\{E_1, E_2, E_3\}$ with the standard basis for $\mathbb{R}^3$. Consider
$\mathrm{ad}_{E_1}$, $\mathrm{ad}_{E_2}$, and $\mathrm{ad}_{E_3}$ as operators on su(2), hence on $\mathbb{R}^3$. Show that
$\mathrm{ad}_{E_i} = F_i$ for $i = 1, 2, 3$. It follows that ad is a Lie algebra isomorphism of
su(2) onto so(3).

Now, consider $\mathrm{Ad} : \mathrm{SU}(2) \to \mathrm{GL}(\mathrm{su}(2)) = \mathrm{GL}(3; \mathbb{R})$. Show that the image
of Ad is precisely SO(3). Show that the kernel of Ad is $\{I, -I\}$.

Show that $\mathrm{Ad} : \mathrm{SU}(2) \to \mathrm{SO}(3)$ is the homomorphism $\Phi$ required by
Lemma 4.30.

13. Suppose $V$ is a finite-dimensional representation of a group or Lie algebra
and that $W$ is a nontrivial invariant subspace of $V$. Show that there exists
a nontrivial irreducible invariant subspace for $V$ that is contained in $W$.

14. Suppose $V$ is a finite-dimensional representation of a group or Lie algebra
and that $W$ and $W'$ are invariant subspaces of $V$ with $W' \subset W$. Suppose
that $U$ is an invariant subspace for $V$ such that $V = W' \oplus U$. Show that
$W \cap U$ is an invariant subspace of $V$ and that $W = W' \oplus (W \cap U)$.

15. Suppose that $V_1$ and $V_2$ are inequivalent irreducible representations of a
group or Lie algebra, and consider the associated representation $V_1 \oplus V_2$.
Regard $V_1$ and $V_2$ as subspaces of $V_1 \oplus V_2$ in the obvious way. Following
the outline below, show that $V_1$ and $V_2$ are the only nontrivial invariant
subspaces of $V_1 \oplus V_2$.

(a) First assume that $U$ is a nontrivial irreducible invariant subspace. Let
$P_1 : V_1 \oplus V_2 \to V_1$ be the projection onto the first factor and let $P_2$ be the
projection onto the second factor. Show that $P_1$ and $P_2$ are intertwining
maps. Show that $U = V_1$ or $U = V_2$.

(b) Using Exercise 13, show that $V_1$ and $V_2$ are the only nontrivial invariant
subspaces of $V_1 \oplus V_2$.

16. Suppose that $V$ is an irreducible finite-dimensional representation of a
group or Lie algebra, and consider the associated representation $V \oplus V$.
Show that every nontrivial invariant subspace $U$ of $V \oplus V$ is equivalent to
$V$ and is of the form

\[ U = \{ (\lambda_1 v, \lambda_2 v) \mid v \in V \}, \]

for some constants $\lambda_1$ and $\lambda_2$, not both zero.

17. Suppose $V$ is a completely reducible finite-dimensional representation of
some group or Lie algebra, in which case $V$ is equivalent to a representation
of the form

\[ (V_1 \oplus \cdots \oplus V_1) \oplus (V_2 \oplus \cdots \oplus V_2) \oplus \cdots \oplus (V_k \oplus \cdots \oplus V_k), \]

where $V_1, \ldots, V_k$ are pairwise inequivalent irreducible representations and
where $V_l$ occurs $n_l$ times, $l = 1, \ldots, k$. Suppose $U$ is a nontrivial irreducible
invariant subspace of $V$. Show that $U$ is contained in $V_l \oplus \cdots \oplus V_l$ for some
$l$ and that, as a subspace of $V_l \oplus \cdots \oplus V_l$, $U$ is of the form

\[ \{ (\lambda_1 v, \lambda_2 v, \ldots, \lambda_{n_l} v) \mid v \in V_l \}, \]

where the $\lambda$'s are constants that are not all equal to zero.

18. Using the results and methods of the five preceding exercises, prove
Proposition 4.33.


Part II 


Semisimple Theory 


5 


The Representations of SU(3) 


5.1 Introduction 


There is a theory of the representations of semisimple groups and Lie algebras 
(discussed in Chapters 6, 7, and 8) that includes as a special case the repre- 
sentation theory of SU(3). However, I feel that it is worthwhile to examine the 
case of SU(3) separately, before going on to the general theory. I feel this way 
partly because SU(3) is an important group in physics, but chiefly because 
the general semisimple theory is difficult to digest. Considering a nontrivial 
example makes what is going on much clearer. In fact, all of the elements of 
the general theory are present already in the case of SU(3), so we do not lose 
too much by considering at first just this case. 

The main result of this chapter is Theorem 5.9, which states that an
irreducible finite-dimensional representation of SU(3) can be classified in terms
of its “highest weight.” This is analogous to labeling the irreducible
representations $V_m$ of SU(2) or sl(2;C) by the highest eigenvalue of $\pi_m(H)$. (The
highest eigenvalue of $\pi_m(H)$ in $V_m$ is precisely $m$.) In the next two chapters,
we will look at the analogous results for general semisimple Lie algebras.

The group SU(3) is simply connected (Appendix E), and so the finite- 
dimensional representations of SU(3) are in one-to-one correspondence with 
the finite-dimensional representations of the Lie algebra su(3). Meanwhile, 
the complex representations of su(3) are in one-to-one correspondence with 
the complex-linear representations of the complexified Lie algebra $\mathrm{su}(3)_{\mathbb{C}}$ =
sl(3;C) (Proposition 4.6). Moreover, a representation of SU(3) is irreducible 
if and only if the associated representation of su(3) is irreducible, and this 
holds if and only if the associated complex-linear representation of sl(3; C) 
is irreducible. (This follows from Proposition 4.5, Proposition 4.6, and the 
connectedness of SU(3).) Thus, we have the following result. 


Proposition 5.1. There is a one-to-one correspondence between the
finite-dimensional complex representations $\Pi$ of SU(3) and the finite-dimensional
complex-linear representations $\pi$ of sl(3;C). This correspondence is
determined by the property that

\[ \Pi(e^X) = e^{\pi(X)} \]

for all $X \in$ su(3) $\subset$ sl(3;C).
The representation $\Pi$ is irreducible if and only if the representation $\pi$ is
irreducible.


Since SU(3) is compact, Proposition 4.36 tells us that all of the finite- 
dimensional representations of SU(3) are direct sums of irreducible represen- 
tations. The above proposition then implies that the same holds for sl(3;C); 
that is, sl(3;C) has the complete reducibility property. Complete reducibility 
will be an essential ingredient even in the classification of irreducible repre- 
sentations. (See the proof of Proposition 5.16.) 

Moreover, we can apply the same reasoning to the simply-connected group 
SU(2), its Lie algebra su(2), and its complexified Lie algebra sl(2;C). Thus, 
we have established the following. 


Proposition 5.2. Every finite-dimensional representation of sl(2;C) or sl(3;C) 
decomposes as a direct sum of irreducible invariant subspaces. 


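We will use the following basis for sl(3;C):

\[
H_1 = \begin{pmatrix} 1 & 0 & 0 \\ 0 & -1 & 0 \\ 0 & 0 & 0 \end{pmatrix}, \quad
H_2 = \begin{pmatrix} 0 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & -1 \end{pmatrix},
\]
\[
X_1 = \begin{pmatrix} 0 & 1 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}, \quad
X_2 = \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 0 & 0 \end{pmatrix}, \quad
X_3 = \begin{pmatrix} 0 & 0 & 1 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix},
\]
\[
Y_1 = \begin{pmatrix} 0 & 0 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}, \quad
Y_2 = \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 1 & 0 \end{pmatrix}, \quad
Y_3 = \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \\ 1 & 0 & 0 \end{pmatrix}.
\]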


Note that the span of $\{H_1, X_1, Y_1\}$ is a subalgebra of sl(3;C) which is
isomorphic to sl(2;C) (as can be seen by ignoring the third row and the third
column in each matrix). Similarly, the span of $\{H_2, X_2, Y_2\}$ is a subalgebra
isomorphic to sl(2;C). Thus, we have the following commutation relations:

\[
\begin{aligned}
[H_1, X_1] &= 2X_1, & [H_2, X_2] &= 2X_2, \\
[H_1, Y_1] &= -2Y_1, & [H_2, Y_2] &= -2Y_2, \\
[X_1, Y_1] &= H_1, & [X_2, Y_2] &= H_2.
\end{aligned}
\]

We now list all of the commutation relations among the basis elements
which involve at least one of $H_1$ and $H_2$. (This includes some repetitions of
the above commutation relations.)

\[
\begin{aligned}
[H_1, H_2] &= 0; \\
[H_1, X_1] &= 2X_1, & [H_1, Y_1] &= -2Y_1, \\
[H_2, X_1] &= -X_1, & [H_2, Y_1] &= Y_1; \\
[H_1, X_2] &= -X_2, & [H_1, Y_2] &= Y_2, \\
[H_2, X_2] &= 2X_2, & [H_2, Y_2] &= -2Y_2; \\
[H_1, X_3] &= X_3, & [H_1, Y_3] &= -Y_3, \\
[H_2, X_3] &= X_3, & [H_2, Y_3] &= -Y_3.
\end{aligned} \tag{5.1}
\]


Finally, we list all of the remaining commutation relations.

\[
\begin{aligned}
[X_1, Y_1] &= H_1, \\
[X_2, Y_2] &= H_2, \\
[X_3, Y_3] &= H_1 + H_2; \\
[X_1, X_2] &= X_3, & [Y_1, Y_2] &= -Y_3, \\
[X_1, Y_2] &= 0, & [X_2, Y_1] &= 0; \\
[X_1, X_3] &= 0, & [Y_1, Y_3] &= 0, \\
[X_2, X_3] &= 0, & [Y_2, Y_3] &= 0; \\
[X_2, Y_3] &= Y_1, & [X_3, Y_2] &= X_1, \\
[X_1, Y_3] &= -Y_2, & [X_3, Y_1] &= -X_2.
\end{aligned}
\]


All of the analysis we will do for the representations of sl(3;C) will be in 
terms of the above basis. From now on, all representations of sl(3;C) will be
assumed to be finite dimensional and complex linear. 
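
The commutation relations of this section are easy to get wrong by a sign, so a mechanical check can be reassuring. The following is a minimal sketch in plain Python, assuming the basis matrices given above; the helper names (`unit`, `bracket`, `RELATIONS`, `failures`) are illustrative and not part of the text.

```python
# Check the commutation relations of sl(3;C) for the basis H1, H2, X1..X3, Y1..Y3.
# Matrices are 3x3 nested lists of integers; all helper names are illustrative.

def unit(i, j):
    """Matrix with a 1 in row i, column j (1-based) and zeros elsewhere."""
    return [[1 if (r, c) == (i - 1, j - 1) else 0 for c in range(3)] for r in range(3)]

def mul(A, B):
    return [[sum(A[r][k] * B[k][c] for k in range(3)) for c in range(3)] for r in range(3)]

def add(A, B):
    return [[A[r][c] + B[r][c] for c in range(3)] for r in range(3)]

def scale(s, A):
    return [[s * x for x in row] for row in A]

def bracket(A, B):
    """The commutator [A, B] = AB - BA."""
    return add(mul(A, B), scale(-1, mul(B, A)))

H1 = [[1, 0, 0], [0, -1, 0], [0, 0, 0]]
H2 = [[0, 0, 0], [0, 1, 0], [0, 0, -1]]
X1, X2, X3 = unit(1, 2), unit(2, 3), unit(1, 3)
Y1, Y2, Y3 = unit(2, 1), unit(3, 2), unit(3, 1)
ZERO = scale(0, H1)

# Each entry is (A, B, expected value of [A, B]).
RELATIONS = [
    (H1, H2, ZERO),
    (H1, X1, scale(2, X1)), (H1, Y1, scale(-2, Y1)),
    (H2, X1, scale(-1, X1)), (H2, Y1, Y1),
    (H1, X2, scale(-1, X2)), (H1, Y2, Y2),
    (H2, X2, scale(2, X2)), (H2, Y2, scale(-2, Y2)),
    (H1, X3, X3), (H1, Y3, scale(-1, Y3)),
    (H2, X3, X3), (H2, Y3, scale(-1, Y3)),
    (X1, Y1, H1), (X2, Y2, H2), (X3, Y3, add(H1, H2)),
    (X1, X2, X3), (Y1, Y2, scale(-1, Y3)),
    (X1, Y2, ZERO), (X2, Y1, ZERO),
    (X1, X3, ZERO), (Y1, Y3, ZERO), (X2, X3, ZERO), (Y2, Y3, ZERO),
    (X2, Y3, Y1), (X3, Y2, X1),
    (X1, Y3, scale(-1, Y2)), (X3, Y1, scale(-1, X2)),
]

# Indices of any relations that fail; an empty list means every relation holds.
failures = [k for k, (A, B, C) in enumerate(RELATIONS) if bracket(A, B) != C]
```

An empty `failures` list indicates that every listed relation holds for these matrices.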


5.2 Weights and Roots 


Our basic strategy in classifying the representations of sl(3;C) is to
simultaneously diagonalize $\pi(H_1)$ and $\pi(H_2)$. (See Section B.8 for information on
simultaneous diagonalization.) Since $H_1$ and $H_2$ commute, $\pi(H_1)$ and $\pi(H_2)$
will also commute (for any representation $\pi$) and so there is at least a chance
that $\pi(H_1)$ and $\pi(H_2)$ can be simultaneously diagonalized. (Compare
Proposition B.13.)


Definition 5.3. If $(\pi, V)$ is a representation of sl(3;C), then an ordered pair
$\mu = (m_1, m_2) \in \mathbb{C}^2$ is called a weight for $\pi$ if there exists $v \ne 0$ in $V$ such
that

\[
\begin{aligned}
\pi(H_1) v &= m_1 v, \\
\pi(H_2) v &= m_2 v.
\end{aligned} \tag{5.2}
\]

A nonzero vector $v$ satisfying (5.2) is called a weight vector corresponding
to the weight $\mu$. If $\mu = (m_1, m_2)$ is a weight, then the space of all vectors $v$
satisfying (5.2) (including the zero vector) is the weight space corresponding
to the weight $\mu$. The multiplicity of a weight is the dimension of the
corresponding weight space.


Thus, a weight is simply a pair of simultaneous eigenvalues for $\pi(H_1)$ and
$\pi(H_2)$. (See Section B.8 for a discussion of simultaneous eigenvectors and
eigenvalues.) It is easily shown that equivalent representations have the same
weights and multiplicities.


Proposition 5.4. Every representation of sl(3;C) has at least one weight. 


Proof. Since we are working over the complex numbers, $\pi(H_1)$ has at least one
eigenvalue $m_1 \in \mathbb{C}$. Let $W \subset V$ be the eigenspace for $\pi(H_1)$ with eigenvalue
$m_1$. Since $[H_1, H_2] = 0$, $\pi(H_2)$ commutes with $\pi(H_1)$, and so, by Proposition
B.4, $\pi(H_2)$ must map $W$ into itself. Thus, $\pi(H_2)$ can be viewed as an operator
on $W$. Then, the restriction of $\pi(H_2)$ to $W$ must have at least one eigenvector
$w$ with eigenvalue $m_2 \in \mathbb{C}$, and $w$ is a simultaneous eigenvector for $\pi(H_1)$ and
$\pi(H_2)$ with eigenvalues $m_1$ and $m_2$. □


Now, every representation $\pi$ of sl(3;C) can be viewed, by restriction, as a
representation of the subalgebra $\{H_1, X_1, Y_1\} \cong$ sl(2;C). Note that even if $\pi$
is irreducible as a representation of sl(3;C), there is no reason to expect that
it will still be irreducible as a representation of the subalgebra $\{H_1, X_1, Y_1\}$.
Nevertheless, $\pi$ restricted to $\{H_1, X_1, Y_1\}$ must be some finite-dimensional
representation of sl(2;C). The same reasoning applies to the restriction of $\pi$
to the subalgebra $\{H_2, X_2, Y_2\}$, which is also isomorphic to sl(2;C).

Now, recall Theorem 4.12, which tells us that in any finite-dimensional
representation of sl(2;C), irreducible or not, all of the eigenvalues of $\pi(H)$
must be integers. Theorem 4.12 has the following corollary.


Corollary 5.5. If $\pi$ is a representation of sl(3;C), then all of the weights of
$\pi$ are of the form

\[ \mu = (m_1, m_2) \]

with $m_1$ and $m_2$ being integers.


Proof. Apply Theorem 4.12 to the restriction of $\pi$ to $\{H_1, X_1, Y_1\}$ and to the
restriction of $\pi$ to $\{H_2, X_2, Y_2\}$. □


Our strategy now is to begin with one simultaneous eigenvector for $\pi(H_1)$
and $\pi(H_2)$ and then to apply $\pi(X_i)$ or $\pi(Y_i)$ and see what the effect is. The
following definition is relevant in this context.


Definition 5.6. An ordered pair $\alpha = (\alpha_1, \alpha_2) \in \mathbb{C}^2$ is called a root if

1. $\alpha_1$ and $\alpha_2$ are not both zero, and
2. there exists a nonzero $Z \in$ sl(3;C) such that

\[
\begin{aligned}
[H_1, Z] &= \alpha_1 Z, \\
[H_2, Z] &= \alpha_2 Z.
\end{aligned}
\]

The element $Z$ is called a root vector corresponding to the root $\alpha$.


Condition 2 in the definition says that $Z$ is a simultaneous eigenvector
for $\mathrm{ad}_{H_1}$ and $\mathrm{ad}_{H_2}$. This means that $Z$ is a weight vector for the adjoint
representation with weight $(\alpha_1, \alpha_2)$. Thus, taking into account Condition 1,
we may say that the roots are precisely the nonzero weights of the adjoint
representation. Corollary 5.5 then tells us that for any root, both $\alpha_1$ and $\alpha_2$
must be integers, which we can also see directly in (5.3). The commutation
relations (5.1) tell us what the roots for sl(3;C) are. There are six roots:

\[
\begin{array}{cc}
\alpha & Z \\ \hline
(2, -1) & X_1 \\
(-1, 2) & X_2 \\
(1, 1) & X_3 \\
(-2, 1) & Y_1 \\
(1, -2) & Y_2 \\
(-1, -1) & Y_3
\end{array} \tag{5.3}
\]

Note that $H_1$ and $H_2$ are also simultaneous eigenvectors for $\mathrm{ad}_{H_1}$ and $\mathrm{ad}_{H_2}$,
but they are not root vectors because the simultaneous eigenvalues are both
zero. Since the vectors in (5.3) together with $H_1$ and $H_2$ form a basis for
sl(3;C), it is not hard to show that the roots listed in (5.3) are the only roots
(Exercise 1). These six roots form a “root system,” conventionally called $A_2$.
For more information, see Chapters 6, 7, and 8.
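
The six roots can be read off from the brackets with $H_1$ and $H_2$. As an illustration only, the sketch below recomputes them in plain Python; the names `eigenvalue`, `ROOT_VECTORS`, and `roots` are hypothetical helpers introduced for this example.

```python
# Recover the root of each off-diagonal basis element Z of sl(3;C) from
# [H1, Z] = a1*Z and [H2, Z] = a2*Z.  Helper names are illustrative only.

def unit(i, j):
    """Matrix with a 1 in row i, column j (1-based) and zeros elsewhere."""
    return [[1 if (r, c) == (i - 1, j - 1) else 0 for c in range(3)] for r in range(3)]

def mul(A, B):
    return [[sum(A[r][k] * B[k][c] for k in range(3)) for c in range(3)] for r in range(3)]

def bracket(A, B):
    AB, BA = mul(A, B), mul(B, A)
    return [[AB[r][c] - BA[r][c] for c in range(3)] for r in range(3)]

H1 = [[1, 0, 0], [0, -1, 0], [0, 0, 0]]
H2 = [[0, 0, 0], [0, 1, 0], [0, 0, -1]]
ROOT_VECTORS = {
    "X1": unit(1, 2), "X2": unit(2, 3), "X3": unit(1, 3),
    "Y1": unit(2, 1), "Y2": unit(3, 2), "Y3": unit(3, 1),
}

def eigenvalue(H, Z):
    """Return the integer a with [H, Z] == a*Z, or None if there is none."""
    B = bracket(H, Z)
    # Read the candidate eigenvalue off the first nonzero entry of Z.
    r, c = next((r, c) for r in range(3) for c in range(3) if Z[r][c] != 0)
    a = B[r][c] // Z[r][c]
    if all(B[i][j] == a * Z[i][j] for i in range(3) for j in range(3)):
        return a
    return None

roots = {name: (eigenvalue(H1, Z), eigenvalue(H2, Z)) for name, Z in ROOT_VECTORS.items()}
```

The same function applied to $H_1$ or $H_2$ returns 0 for both brackets, which is why those two elements are not root vectors.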

It is convenient to single out the two roots corresponding to $X_1$ and $X_2$
and give them special names:

\[
\begin{aligned}
\alpha_1 &= (2, -1), \\
\alpha_2 &= (-1, 2).
\end{aligned} \tag{5.4}
\]


The roots $\alpha_1$ and $\alpha_2$ are called the positive simple roots. They have the
property that all of the roots can be expressed as linear combinations of $\alpha_1$
and $\alpha_2$ with integer coefficients, and these coefficients are (for each root)
either all greater than or equal to zero or all less than or equal to zero. This
is verified by direct computation:

\[
\begin{aligned}
(2, -1) &= \alpha_1, \\
(-1, 2) &= \alpha_2, \\
(1, 1) &= \alpha_1 + \alpha_2, \\
(-2, 1) &= -\alpha_1, \\
(1, -2) &= -\alpha_2, \\
(-1, -1) &= -\alpha_1 - \alpha_2.
\end{aligned}
\]

(The decision to designate $\alpha_1$ and $\alpha_2$ as the positive simple roots is arbitrary;
any other pair of roots with similar properties would do just as well. We simply
choose one set of positive simple roots and hold to that choice throughout the
chapter. See Section 6.8.)
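
The direct computation can be automated. If a root $(r_1, r_2)$ equals $a\alpha_1 + b\alpha_2$ with $\alpha_1 = (2,-1)$ and $\alpha_2 = (-1,2)$, then solving the two linear equations gives $3a = 2r_1 + r_2$ and $3b = r_1 + 2r_2$. The sketch below uses that formula; the function name `simple_root_coefficients` is an assumption made for this example.

```python
# Express each root of sl(3;C) as an integer combination a*alpha1 + b*alpha2,
# where alpha1 = (2, -1) and alpha2 = (-1, 2).  Names are illustrative only.

ROOTS = [(2, -1), (-1, 2), (1, 1), (-2, 1), (1, -2), (-1, -1)]

def simple_root_coefficients(root):
    """Return integers (a, b) with root == a*(2, -1) + b*(-1, 2)."""
    r1, r2 = root
    three_a, three_b = 2 * r1 + r2, r1 + 2 * r2
    if three_a % 3 or three_b % 3:
        raise ValueError("not an integer combination of the simple roots")
    return three_a // 3, three_b // 3

coefficients = {root: simple_root_coefficients(root) for root in ROOTS}

# For every root, the two coefficients are both >= 0 or both <= 0.
same_sign = all(
    (a >= 0 and b >= 0) or (a <= 0 and b <= 0) for a, b in coefficients.values()
)
```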

The significance of the roots for the representation theory of sl(3;C) is 
contained in the following lemma. Although its proof is very easy, this lemma 
plays a crucial role in the classification of the representations of sl(3;C). Note
that this lemma is the analog of Lemma 4.10, which was the key to the clas- 
sification of the representations of sl(2; C). 


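Lemma 5.7. Let $\alpha = (\alpha_1, \alpha_2)$ be a root and $Z_\alpha$ a corresponding root vector
in sl(3;C). Let $\pi$ be a representation of sl(3;C), $\mu = (m_1, m_2)$ a weight for $\pi$,
and $v \ne 0$ a corresponding weight vector. Then,

\[
\begin{aligned}
\pi(H_1)\pi(Z_\alpha) v &= (m_1 + \alpha_1)\,\pi(Z_\alpha) v, \\
\pi(H_2)\pi(Z_\alpha) v &= (m_2 + \alpha_2)\,\pi(Z_\alpha) v.
\end{aligned}
\]

Thus, either $\pi(Z_\alpha) v = 0$ or $\pi(Z_\alpha) v$ is a new weight vector with weight
$\mu + \alpha = (m_1 + \alpha_1, m_2 + \alpha_2)$.


Proof. The definition of a root tells us that we have the commutation relation
$[H_1, Z_\alpha] = \alpha_1 Z_\alpha$. Thus,

\[
\begin{aligned}
\pi(H_1)\pi(Z_\alpha) v &= \bigl(\pi(Z_\alpha)\pi(H_1) + \alpha_1 \pi(Z_\alpha)\bigr) v \\
&= \pi(Z_\alpha)(m_1 v) + \alpha_1 \pi(Z_\alpha) v \\
&= (m_1 + \alpha_1)\,\pi(Z_\alpha) v.
\end{aligned}
\]

A similar argument allows us to compute $\pi(H_2)\pi(Z_\alpha) v$. □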
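
To see the lemma at work in a concrete case, one can take the standard representation on $\mathbb{C}^3$ (an illustrative choice, not required by the lemma) and apply each root vector to each standard basis vector. The sketch below does this in plain Python; `weight_of` and `check_lemma` are hypothetical helper names.

```python
# Illustrate Lemma 5.7 in the standard representation of sl(3;C) on C^3:
# a root vector with root alpha sends a weight vector of weight mu either to
# zero or to a weight vector of weight mu + alpha.  Names are illustrative.

def unit(i, j):
    """Matrix with a 1 in row i, column j (1-based) and zeros elsewhere."""
    return [[1 if (r, c) == (i - 1, j - 1) else 0 for c in range(3)] for r in range(3)]

def apply(M, v):
    return [sum(M[r][c] * v[c] for c in range(3)) for r in range(3)]

H1 = [[1, 0, 0], [0, -1, 0], [0, 0, 0]]
H2 = [[0, 0, 0], [0, 1, 0], [0, 0, -1]]

ROOT_VECTORS = {  # name: (matrix, root)
    "X1": (unit(1, 2), (2, -1)), "X2": (unit(2, 3), (-1, 2)), "X3": (unit(1, 3), (1, 1)),
    "Y1": (unit(2, 1), (-2, 1)), "Y2": (unit(3, 2), (1, -2)), "Y3": (unit(3, 1), (-1, -1)),
}

def weight_of(v):
    """Return (m1, m2) if v is a nonzero simultaneous eigenvector, else None.

    Because H1 and H2 are diagonal here, v is a simultaneous eigenvector exactly
    when all of its nonzero components share the same pair of diagonal entries.
    """
    pairs = {(H1[i][i], H2[i][i]) for i in range(3) if v[i] != 0}
    return pairs.pop() if len(pairs) == 1 else None

def check_lemma():
    for i in range(3):
        v = [1 if k == i else 0 for k in range(3)]
        mu = weight_of(v)
        for Z, alpha in ROOT_VECTORS.values():
            w = apply(Z, v)
            if any(w) and weight_of(w) != (mu[0] + alpha[0], mu[1] + alpha[1]):
                return False
    return True
```

For instance, $Y_1$ sends $e_1$ (weight $(1,0)$) to $e_2$, whose weight is $(1,0) + (-2,1) = (-1,1)$.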


5.3 The Theorem of the Highest Weight 


We see then that if we have a representation with a weight $\mu = (m_1, m_2)$, then
by applying the root vectors $X_1, X_2, X_3, Y_1, Y_2$, and $Y_3$, we can get some new
weights of the form $\mu + \alpha$, where $\alpha$ is the root. Of course, some of the time,
$\pi(Z_\alpha) v$ will be zero, in which case $\mu + \alpha$ is not necessarily a weight. In fact,
since our representation is finite dimensional, there can be only finitely many
weights, so we must get zero quite often. By analogy to the classification of the
representations of sl(2;C), we would like to single out in each representation
a “highest” weight and then work from there. The following definition gives
the “right” notion of highest.


Definition 5.8. Let $\alpha_1 = (2, -1)$ and $\alpha_2 = (-1, 2)$ be the roots introduced
in (5.4). Let $\mu_1$ and $\mu_2$ be two weights. Then, $\mu_1$ is higher than $\mu_2$ (or,
equivalently, $\mu_2$ is lower than $\mu_1$) if $\mu_1 - \mu_2$ can be written in the form

\[ \mu_1 - \mu_2 = a\alpha_1 + b\alpha_2 \]

with $a \ge 0$ and $b \ge 0$. This relationship is written as $\mu_1 \succeq \mu_2$ or $\mu_2 \preceq \mu_1$.
If $\pi$ is a representation of sl(3;C), then a weight $\mu_0$ for $\pi$ is said to be a
highest weight if for all weights $\mu$ of $\pi$, $\mu \preceq \mu_0$.


Note that the relation of “higher” is only a partial ordering; that is, one
can easily have $\mu_1$ and $\mu_2$ such that $\mu_1$ is neither higher nor lower than $\mu_2$.
For example, $\alpha_1 - \alpha_2$ is neither higher nor lower than 0. This, in particular,
means that a finite set of weights need not have a highest element (e.g., the
set $\{0, \alpha_1 - \alpha_2\}$ has no highest element). Note also that the coefficients $a$ and
$b$ do not have to be integers, even if both $\mu_1$ and $\mu_2$ have integer entries. For
example, (1,0) is higher than (0,0) since $(1,0) = \tfrac{2}{3}\alpha_1 + \tfrac{1}{3}\alpha_2$.
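
The partial ordering is easy to experiment with. Writing $\mu = a\alpha_1 + b\alpha_2$ and solving gives $a = (2m_1 + m_2)/3$ and $b = (m_1 + 2m_2)/3$, so the comparison reduces to two sign checks. The following sketch uses exact rational arithmetic; the function names are assumptions made for the example.

```python
# The "higher than" relation of Definition 5.8, with alpha1 = (2, -1) and
# alpha2 = (-1, 2).  Function names are illustrative only.
from fractions import Fraction

def simple_root_coordinates(mu):
    """Return (a, b) with mu == a*alpha1 + b*alpha2 as exact fractions."""
    m1, m2 = mu
    return Fraction(2 * m1 + m2, 3), Fraction(m1 + 2 * m2, 3)

def is_higher(mu1, mu2):
    """True if mu1 is higher than (or equal to) mu2."""
    a, b = simple_root_coordinates((mu1[0] - mu2[0], mu1[1] - mu2[1]))
    return a >= 0 and b >= 0

def comparable(mu1, mu2):
    return is_higher(mu1, mu2) or is_higher(mu2, mu1)
```

With these definitions, `comparable((3, -3), (0, 0))` is false, which reflects the fact that $\alpha_1 - \alpha_2 = (3, -3)$ is neither higher nor lower than 0.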

We are now ready to state the main theorem regarding the irreducible 
representations of sl(3;C), commonly called the theorem of the highest weight.


Theorem 5.9.

1. Every irreducible representation $\pi$ of sl(3;C) is the direct sum of its weight
spaces; that is, $\pi(H_1)$ and $\pi(H_2)$ are simultaneously diagonalizable in
every irreducible representation.

2. Every irreducible representation of sl(3;C) has a unique highest weight
$\mu_0$, and two equivalent irreducible representations have the same highest
weight.

3. Two irreducible representations of sl(3;C) with the same highest weight
are equivalent.

4. If $\pi$ is an irreducible representation of sl(3;C), then the highest weight $\mu_0$
of $\pi$ is of the form

\[ \mu_0 = (m_1, m_2) \]

with $m_1$ and $m_2$ being non-negative integers.

5. If $m_1$ and $m_2$ are non-negative integers, then there exists an irreducible
representation $\pi$ of sl(3;C) with highest weight $\mu_0 = (m_1, m_2)$.


An ordered pair $(m_1, m_2)$ with $m_1$ and $m_2$ being non-negative integers is
called a dominant integral element. Theorem 5.9 tells us that the highest
weight of each irreducible representation of sl(3;C) is a dominant integral
element and, conversely, that every dominant integral element occurs as the
highest weight of some irreducible representation. Since $(1,0) = \tfrac{2}{3}\alpha_1 + \tfrac{1}{3}\alpha_2$ and
$(0,1) = \tfrac{1}{3}\alpha_1 + \tfrac{2}{3}\alpha_2$, we see that every dominant integral element is higher than
zero. However, if $\mu$ has integer coefficients and is higher than zero, this does
not necessarily mean that $\mu$ is dominant integral. (For example, $\alpha_1 = (2, -1)$
is higher than zero but is not dominant integral.)
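
The distinction between "dominant integral" and "higher than zero" can be checked on a small grid of integer pairs. This is only a sketch; the helper names are hypothetical, and the grid bounds are an arbitrary choice.

```python
# Compare "dominant integral" with "higher than zero" for integer pairs.
# Function names and the search range are illustrative choices.
from fractions import Fraction

def is_higher_than_zero(mu):
    """True if mu = a*alpha1 + b*alpha2 with a >= 0 and b >= 0."""
    m1, m2 = mu
    a, b = Fraction(2 * m1 + m2, 3), Fraction(m1 + 2 * m2, 3)
    return a >= 0 and b >= 0

def is_dominant_integral(mu):
    return all(isinstance(m, int) and m >= 0 for m in mu)

grid = [(m1, m2) for m1 in range(-4, 5) for m2 in range(-4, 5)]

# Every dominant integral element in the grid is higher than zero ...
dominant_implies_higher = all(
    is_higher_than_zero(mu) for mu in grid if is_dominant_integral(mu)
)
# ... but some elements are higher than zero without being dominant integral.
higher_not_dominant = [
    mu for mu in grid if is_higher_than_zero(mu) and not is_dominant_integral(mu)
]
```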

Figure 5.1 shows the roots and dominant integral elements for sl(3;C). This
picture is made using the obvious basis for the space of weights; that is, the
x-coordinate is the eigenvalue of $H_1$ and the y-coordinate is the eigenvalue
of $H_2$. Once we have introduced the Weyl group (Section 5.6), we will see
the same picture (Figure 5.2) rendered using a Weyl-invariant inner product,
which will give a more geometric view of the situation.

Fig. 5.1. Roots and dominant integral elements for sl(3;C) (in obvious basis)

Note the parallels between this result and the classification of the ir- 
reducible representations of sl(2;C): In each irreducible representation of 
sl(2;C), $\pi(H)$ is diagonalizable, and there is a largest eigenvalue of $\pi(H)$.
Two irreducible representations of sl(2;C) with the same largest eigenvalue 
are equivalent. The highest eigenvalue is always a non-negative integer, and, 
conversely, for every non-negative integer m, there is an irreducible represen- 
tation with highest eigenvalue m. 

Note, however, that in the classification of the representations of sl(3;C),
the notion of “highest” does not mean what we might have thought it should
mean; that is, $(m_1, m_2) \succeq (n_1, n_2)$ does not mean $m_1 \ge n_1$ and $m_2 \ge n_2$,
as we might have guessed. (For example, the weight (1,1) is higher than the
weights (−1,2) and (2,−1).) Nevertheless, the condition on which weights
can be highest weights is the obvious one: $m_1$ and $m_2$ must be non-negative
integers.

It is possible to obtain much more information about the irreducible rep- 
resentations besides the highest weight. For example, we have the following 
formula for the dimension of the representation with highest weight (m1, m2). 


Theorem 5.10. The dimension of the irreducible representation with highest
weight $(m_1, m_2)$ is

\[ \frac{1}{2}(m_1 + 1)(m_2 + 1)(m_1 + m_2 + 2). \]

See Humphreys (1972, Section 24.3). (Humphreys refers to sl(3;C) as $A_2$.)
We will not prove this formula here. It is a consequence of the Weyl character
formula, which is discussed in the context of general semisimple Lie algebras
in Sections 7.4 and 7.6.
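
As a quick illustration, the dimension formula can be evaluated for small highest weights. The function name `dimension` is an assumption for this sketch; the product is always even (if $m_1$ and $m_2$ are both even, the third factor is even, and otherwise one of the first two is), so integer division is exact.

```python
# Evaluate the dimension formula of Theorem 5.10 for the irreducible
# representation of sl(3;C) with highest weight (m1, m2).

def dimension(m1, m2):
    """Return (1/2)(m1 + 1)(m2 + 1)(m1 + m2 + 2) for non-negative integers m1, m2."""
    if m1 < 0 or m2 < 0:
        raise ValueError("the highest weight must be dominant integral")
    return (m1 + 1) * (m2 + 1) * (m1 + m2 + 2) // 2

# Dimensions for all highest weights with m1 + m2 <= 2.
small_cases = {(m1, m2): dimension(m1, m2)
               for m1 in range(3) for m2 in range(3) if m1 + m2 <= 2}
```

The values for $(1,0)$ and $(0,1)$ agree with the two three-dimensional representations constructed in the proof of Proposition 5.19, and the value for $(1,1)$ agrees with the dimension of sl(3;C) itself.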

5.4 Proof of the Theorem


It will take us some time to prove Theorem 5.9. The proof will consist of a 
series of propositions. 


Proposition 5.11. In every irreducible representation $(\pi, V)$ of sl(3;C), $\pi(H_1)$
and $\pi(H_2)$ can be simultaneously diagonalized; that is, $V$ is the direct sum of
its weight spaces.


Proof. Let $W$ be the direct sum of the weight spaces in $V$. Equivalently, $W$ is
the space of all vectors $w \in V$ such that $w$ can be written as a linear
combination of simultaneous eigenvectors for $\pi(H_1)$ and $\pi(H_2)$. Since (Proposition
5.4) $\pi$ always has at least one weight, $W \ne \{0\}$.

On the other hand, Lemma 5.7 tells us that if $Z_\alpha$ is a root vector
corresponding to the root $\alpha$, then $\pi(Z_\alpha)$ maps the weight space corresponding to $\mu$
into the weight space corresponding to $\mu + \alpha$. Thus, $W$ is invariant under the
action of all of the root vectors, namely under the action of $X_1, X_2, X_3, Y_1, Y_2$,
and $Y_3$. Since $W$ is certainly invariant under the action of $H_1$ and $H_2$, $W$ is
invariant under all of sl(3;C). Thus, by irreducibility, $W = V$. □


Definition 5.12. A representation $(\pi, V)$ of sl(3;C) is said to be a highest
weight cyclic representation with weight $\mu_0 = (m_1, m_2)$ if there exists
$v \ne 0$ in $V$ such that

1. $v$ is a weight vector with weight $\mu_0$,

2. $\pi(X_1) v = \pi(X_2) v = 0$,

3. the smallest invariant subspace of $V$ containing $v$ is all of $V$.

The vector $v$ is called a cyclic vector for $\pi$.

Proposition 5.13. Let $(\pi, V)$ be a highest weight cyclic representation of
sl(3;C) with weight $\mu_0$. Then,

1. $\pi$ has highest weight $\mu_0$ and

2. the weight space corresponding to the highest weight $\mu_0$ is one dimensional.


Before turning to the proof of this proposition, let us record a simple 
lemma that applies to arbitrary Lie algebras and which will be useful also in 
the setting of general semisimple Lie algebras. 


Lemma 5.14. Suppose that $\mathfrak{g}$ is any Lie algebra and that $\pi$ is a representation
of $\mathfrak{g}$. Suppose that $X_1, \ldots, X_m$ is an ordered basis for $\mathfrak{g}$ as a vector space. Then,
any expression of the form

\[ \pi(X_{i_1})\,\pi(X_{i_2}) \cdots \pi(X_{i_N}) \tag{5.5} \]

can be expressed as a linear combination of terms of the form

\[ \pi(X_m)^{k_m}\,\pi(X_{m-1})^{k_{m-1}} \cdots \pi(X_1)^{k_1} \tag{5.6} \]

where in each term $k_1 + \cdots + k_m \le N$. Here, the $k_j$'s are non-negative integers
(zero is allowed!).

Proof. Let us think about how this works in the case $N \le 2$. If $N = 1$, there
is nothing to do: Any expression of the form $\pi(X_i)$ is of the form (5.6) with
$k_i = 1$ and all the other $k_j$'s equal to zero. If $N = 2$, we consider an expression
of the form $\pi(X_i)\pi(X_j)$. If $i \ge j$, then this is already of the form (5.6) (with
most of the $k_l$'s equal to zero). If $i < j$, then we write

\[
\begin{aligned}
\pi(X_i)\pi(X_j) &= \pi(X_j)\pi(X_i) + \pi([X_i, X_j]) \\
&= \pi(X_j)\pi(X_i) + \sum_{k=1}^{m} c_{ijk}\,\pi(X_k),
\end{aligned} \tag{5.7}
\]

where the $c_{ijk}$'s are the structure constants for this basis of $\mathfrak{g}$, and the
right-hand side is now a linear combination of terms of the form (5.6).

The proof for the general case is by induction on $N$. Assume, then, that the
result holds for a product of $N$ or fewer terms and consider an expression of
the form (5.5) with $N + 1$ factors. By our induction hypothesis, we can assume
that the last $N$ factors are in the desired form and we need only consider an
expression of the form

\[ \pi(X_i)\,\pi(X_m)^{k_m}\,\pi(X_{m-1})^{k_{m-1}} \cdots \pi(X_1)^{k_1} \]

with $k_1 + \cdots + k_m \le N$. Now, we move the factor of $\pi(X_i)$ to the right one
step at a time until it is in the right spot. Each time we have $\pi(X_i)\pi(X_j)$
somewhere in the expression we can move the $\pi(X_i)$ to the right by using
(5.7). As we move $\pi(X_i)$ to the right, we will generate multiple commutator
terms, each of which has one fewer factor and, thus, can be handled by the
induction hypothesis. Thus, we ultimately get several terms with at most $N$ factors,
together with one term having $N + 1$ factors and being of the form (5.6) (once
$\pi(X_i)$ finally gets to the right spot). □
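
A small worked instance may help. Assume the ordered basis $X_1, X_2, X_3, H_1, H_2, Y_1, Y_2, Y_3$ of sl(3;C) (an ordering chosen here for illustration). Applying (5.7) twice to $\pi(X_1)\pi(Y_1)\pi(Y_1)$, and using $[X_1, Y_1] = H_1$ and $[H_1, Y_1] = -2Y_1$, gives $\pi(X_1)\pi(Y_1)^2 = \pi(Y_1)^2\pi(X_1) + 2\pi(Y_1)\pi(H_1) - 2\pi(Y_1)$, where every term on the right has the form (5.6) with at most three factors. The sketch below checks this identity in the adjoint representation, realized as $M \mapsto [Z, M]$ on 3×3 matrices; the helper names are illustrative.

```python
# Check the reordering identity
#   pi(X1) pi(Y1)^2 = pi(Y1)^2 pi(X1) + 2 pi(Y1) pi(H1) - 2 pi(Y1)
# with pi(Z) acting on 3x3 matrices M by M -> [Z, M].  Names are illustrative.

def unit(i, j):
    """Matrix with a 1 in row i, column j (1-based) and zeros elsewhere."""
    return [[1 if (r, c) == (i - 1, j - 1) else 0 for c in range(3)] for r in range(3)]

def mul(A, B):
    return [[sum(A[r][k] * B[k][c] for k in range(3)) for c in range(3)] for r in range(3)]

def add(A, B):
    return [[A[r][c] + B[r][c] for c in range(3)] for r in range(3)]

def scale(s, A):
    return [[s * x for x in row] for row in A]

def bracket(A, B):
    return add(mul(A, B), scale(-1, mul(B, A)))

H1 = [[1, 0, 0], [0, -1, 0], [0, 0, 0]]
X1, Y1 = unit(1, 2), unit(2, 1)

def lhs(M):
    # pi(X1) pi(Y1) pi(Y1) applied to M (the rightmost factor acts first).
    return bracket(X1, bracket(Y1, bracket(Y1, M)))

def rhs(M):
    # The reordered expression, with the X1 factor acting first in each term.
    t1 = bracket(Y1, bracket(Y1, bracket(X1, M)))
    t2 = scale(2, bracket(Y1, bracket(H1, M)))
    t3 = scale(-2, bracket(Y1, M))
    return add(add(t1, t2), t3)

# Compare both sides on every elementary 3x3 matrix.
identity_holds = all(lhs(unit(i, j)) == rhs(unit(i, j))
                     for i in range(1, 4) for j in range(1, 4))
```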


We now proceed with the proof of Proposition 5.13. 


Proof. Let $v$ be as in the definition. Consider the subspace $W$ of $V$ spanned
by elements of the form

\[ w = \pi(Y_{i_1})\,\pi(Y_{i_2}) \cdots \pi(Y_{i_n})\, v \tag{5.8} \]

with each $i_l$ equal to 1, 2, or 3 and $n \ge 0$. (If $n = 0$, it is understood that
$w$ in (5.8) is equal to $v$.) I assert that $W$ is invariant. To see this, it suffices
to check that $W$ is invariant under each of the basis elements, which we do
by using the lemma. We take as our basis for sl(3;C) the elements $X_1$, $X_2$,
$X_3$, $H_1$, $H_2$, $Y_1$, $Y_2$, and $Y_3$, in that order. If we multiply an element $w$ by ($\pi$
applied to) some Lie algebra element, the lemma tells us that we can rewrite
the resulting vector as a linear combination of terms in which the $\pi(X_i)$'s
act first, the $\pi(H_i)$'s act second, and the $\pi(Y_i)$'s act last, and all of these
are applied to the vector $v$. However, $v$ is annihilated by the $\pi(X_i)$'s, so any
term having a positive power of any $X_i$ is simply zero and we are left with
the $\pi(H_i)$'s and the $\pi(Y_i)$'s acting on $v$ (in that order). Furthermore, $v$ is an
eigenvector for $\pi(H_1)$ and $\pi(H_2)$, so any factors of $\pi(H_i)$ acting on $v$ can be
replaced by constants in front of the whole expression. That leaves only factors
of $\pi(Y_i)$ applied to $v$, which means that we are getting a linear combination of
vectors of the form (5.8). This shows that $W$ is invariant. Since by definition
$W$ contains $v$, we must have $W = V$.

Now, $Y_1$ is a root vector with root $-\alpha_1$, $Y_2$ is a root vector with root $-\alpha_2$,
and $Y_3$ is a root vector with root $-\alpha_1 - \alpha_2$. So, applying Lemma 5.7 repeatedly,
we see that each element of the form (5.8) is either zero or a weight vector
with weight $\mu_0 - n_1\alpha_1 - n_2\alpha_2$ for some non-negative integers $n_1$ and $n_2$. Thus,
$V = W$ is spanned by $v$ together with weight vectors with weights lower than
$\mu_0$. Thus, $\mu_0$ is the highest weight for $V$.

Furthermore, every element $w$ of $W$ can be written as $w = cv + v_1 + \cdots + v_m$,
where $v_1, \ldots, v_m$ are weight vectors with distinct weights lower than
$\mu_0$. (If at first the weights are not distinct, then simply combine all the terms
corresponding to each distinct weight.) It then follows from Proposition B.14
that such a vector can be a weight vector with weight $\mu_0$ only if all the $v_k$'s
($k = 1, \ldots, m$) are zero, in which case, $w$ is a multiple of $v$. (Apply the
proposition to $(cv - w) + v_1 + \cdots + v_m$.) Thus, the weight space corresponding
to $\mu_0$ is spanned by $v$ and the corresponding weight space is one dimensional.
□
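
The bookkeeping in the second paragraph of the proof can be made explicit: each $Y_1$ lowers the weight by $\alpha_1$, each $Y_2$ by $\alpha_2$, and each $Y_3$ by $\alpha_1 + \alpha_2$. The sketch below, with the hypothetical helper `lowered_weight`, returns the only weight that a vector of the form (5.8) can have if it is nonzero; it does not decide whether the vector actually vanishes.

```python
# Candidate weight of w = pi(Y_{i_1}) ... pi(Y_{i_n}) v when v has weight mu0.
# The function name is illustrative; the vector w itself may still be zero.

ALPHA1, ALPHA2 = (2, -1), (-1, 2)

def lowered_weight(mu0, indices):
    """Return (weight, (n1, n2)) with weight == mu0 - n1*ALPHA1 - n2*ALPHA2."""
    if any(i not in (1, 2, 3) for i in indices):
        raise ValueError("each index must be 1, 2, or 3")
    n1 = sum(1 for i in indices if i in (1, 3))  # Y1 and Y3 each remove one alpha1
    n2 = sum(1 for i in indices if i in (2, 3))  # Y2 and Y3 each remove one alpha2
    weight = (mu0[0] - n1 * ALPHA1[0] - n2 * ALPHA2[0],
              mu0[1] - n1 * ALPHA1[1] - n2 * ALPHA2[1])
    return weight, (n1, n2)
```

For example, starting from $\mu_0 = (1,1)$, both $\pi(Y_1)\pi(Y_2)v$ and $\pi(Y_3)v$ can only have weight $(0,0)$.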


Proposition 5.15. Every irreducible representation of sl(3;C) is a highest
weight cyclic representation, with a unique highest weight $\mu_0$.


Proof. Uniqueness is immediate, since by the previous proposition, $\mu_0$ is the
highest weight, and two distinct weights cannot both be highest.

We have already shown that every irreducible representation is the direct
sum of its weight spaces. Since the representation is finite dimensional, there
can be only finitely many weights. It follows that there must exist a weight
$\mu_0$ such that there is no weight $\mu \ne \mu_0$ with $\mu \succeq \mu_0$. This says that there is
no weight higher than $\mu_0$ (which is not the same as saying that $\mu_0$ is highest).
However, if there is no weight higher than $\mu_0$, then for any nonzero weight
vector $v$ with weight $\mu_0$, we must have

\[ \pi(X_1) v = \pi(X_2) v = 0. \]

(For otherwise, say, $\pi(X_1) v$ will be a weight vector with weight $\mu_0 + \alpha_1 \succeq \mu_0$.)

Since $\pi$ is assumed irreducible, the smallest invariant subspace containing
$v$ must be the whole space; therefore, the representation is highest weight
cyclic. □


Proposition 5.16. Every highest weight cyclic representation of sl(3;C) is 
irreducible. 


Proof. Let $(\pi, V)$ be a highest weight cyclic representation with highest weight
$\mu_0$ and cyclic vector $v$. By complete reducibility (Proposition 5.2), $V$
decomposes as a direct sum of irreducible representations

\[ V \cong \bigoplus_i V_i. \tag{5.9} \]

By Proposition 5.11, each of the $V_i$'s is the direct sum of its weight spaces.
Since the weight $\mu_0$ occurs in $V$, it must occur in some $V_i$. (This follows from
Proposition B.14.) On the other hand, Proposition 5.13 says that the weight
space corresponding to $\mu_0$ is one dimensional; that is, $v$ is (up to a constant)
the only vector in $V$ with weight $\mu_0$. Thus, $V_i$ must contain $v$. However, then
$V_i$ is an invariant subspace containing $v$, so $V_i = V$. Thus, there is only one
term in the sum (5.9), and $V$ is irreducible. □


Proposition 5.17. Two irreducible representations of sl(3;C) with the same 
highest weight are equivalent. 


Proof. We now know that a representation is irreducible if and only if it is
highest weight cyclic. Suppose that $(\pi, V)$ and $(\sigma, W)$ are two such
representations with the same highest weight $\mu_0$. Let $v$ and $w$ be the cyclic vectors
for $V$ and $W$, respectively. Now, consider the representation $V \oplus W$ and let
$U$ be the smallest invariant subspace of $V \oplus W$ which contains the vector $(v, w)$.

By definition, $U$ is a highest weight cyclic representation, therefore
irreducible by Proposition 5.16. Consider the two “projection” maps
$P_1 : V \oplus W \to V$, $P_1(v, w) = v$ and $P_2 : V \oplus W \to W$, $P_2(v, w) = w$. It is
easy to check that $P_1$ and $P_2$ are intertwining maps of representations.
Therefore, the restrictions of $P_1$ and $P_2$ to $U \subset V \oplus W$ will also be intertwining
maps.

Now, neither $P_1|_U$ nor $P_2|_U$ is the zero map (since both are nonzero on
$(v, w)$). Moreover, $U$, $V$, and $W$ are all irreducible. Therefore, by Schur's
Lemma, $P_1|_U$ is an isomorphism of $U$ with $V$, and $P_2|_U$ is an isomorphism of
$U$ with $W$. Thus, $V \cong U \cong W$. □


Proposition 5.18. If $\pi$ is an irreducible representation of sl(3;C), then the
highest weight of $\pi$ is of the form

\[ \mu = (m_1, m_2) \]

with $m_1$ and $m_2$ being non-negative integers.


Proof. We already know that all of the weights of $\pi$ are of the form $(m_1, m_2)$,
with $m_1$ and $m_2$ being integers. We must show that if $\mu_0 = (m_1, m_2)$ is the
highest weight, then $m_1$ and $m_2$ are both non-negative. For this, we again
use what we know about the representations of sl(2;C). If $\pi$ is an irreducible
representation of sl(3;C) with highest weight $\mu_0 = (m_1, m_2)$ and if $v \ne 0$ is
a weight vector with weight $\mu_0$, then we must have $\pi(X_1) v = \pi(X_2) v = 0$.
(Otherwise, $\mu_0$ would not be highest.) Theorem 4.12, applied to the
restrictions of $\pi$ to $\{H_1, X_1, Y_1\}$ and to $\{H_2, X_2, Y_2\}$, shows that $m_1$ and $m_2$ must
be non-negative. □
Proposition 5.19. If m_1 and m_2 are non-negative integers, then there exists
an irreducible representation of sl(3;C) with highest weight μ = (m_1, m_2).


Proof. Note that the trivial representation is an irreducible representation
with highest weight (0,0). So, we need only construct representations with at
least one of m_1 and m_2 positive.

First, we construct two irreducible representations with highest weights
(1,0) and (0,1). (These are the so-called fundamental representations.)
We consider first the standard representation of sl(3;C), acting on C^3 in the
obvious way. This representation is easily shown to be irreducible. The
simultaneous eigenvectors for H_1 and H_2 in the standard representation are
the standard basis elements e_1, e_2, and e_3, which have weights (1,0),
(−1,1), and (0,−1), respectively. The highest weight for the standard
representation is (1,0).

To construct an irreducible representation with highest weight (0,1), we
modify the standard representation. Specifically, we define

\[ \pi(Z) = -Z^{tr} \tag{5.10} \]

for all Z ∈ sl(3;C). Using the fact that (AB)^tr = B^tr A^tr, it is easy to
check that

\[ -[Z_1, Z_2]^{tr} = [-Z_1^{tr}, -Z_2^{tr}], \]

so that π is really a representation. (This is isomorphic to the dual of the
standard representation, as defined in Section 4.7.) It is also easily checked
that this representation is irreducible. The simultaneous eigenvectors for H_1
and H_2 in this representation are again e_1, e_2, and e_3, but this time with
weights (−1,0), (1,−1), and (0,1). The highest weight for this representation
is (0,1).
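
The two claims just made (that (5.10) respects brackets and that the weights
are as listed) can be checked mechanically. The following is only a sketch in
plain Python; the helper names `matmul`, `bracket`, `dual`, `unit`, and
`weights` are illustrative and not part of the text. Since both sides of the
bracket identity are bilinear, it is enough to test pairs of basis elements.

```python
# Sketch: check (5.10) on a basis of sl(3;C) and read off the weights.
# All helper names here are illustrative assumptions, not notation from the text.

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def bracket(a, b):
    ab, ba = matmul(a, b), matmul(b, a)
    return [[ab[i][j] - ba[i][j] for j in range(3)] for i in range(3)]

def dual(z):
    """pi(Z) = -Z^tr, as in (5.10)."""
    return [[-z[j][i] for j in range(3)] for i in range(3)]

def unit(i, j):
    """Elementary matrix with a 1 in row i, column j (0-based)."""
    return [[1 if (r, c) == (i, j) else 0 for c in range(3)] for r in range(3)]

H1 = [[1, 0, 0], [0, -1, 0], [0, 0, 0]]
H2 = [[0, 0, 0], [0, 1, 0], [0, 0, -1]]
# H1, H2 and the six off-diagonal elementary matrices span sl(3;C).
basis = [H1, H2] + [unit(i, j) for i in range(3) for j in range(3) if i != j]

# pi([Z1, Z2]) == [pi(Z1), pi(Z2)] for every pair of basis elements
homomorphism_ok = all(dual(bracket(a, b)) == bracket(dual(a), dual(b))
                      for a in basis for b in basis)

def weights(h1, h2):
    """Weights of e_1, e_2, e_3 when h1 and h2 act diagonally."""
    return [(h1[k][k], h2[k][k]) for k in range(3)]

standard_weights = weights(H1, H2)
dual_weights = weights(dual(H1), dual(H2))
print(homomorphism_ok, standard_weights, dual_weights)
```

The two weight lists are negatives of one another, entry by entry, which is
consistent with the identification of (5.10) with the dual of the standard
representation.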

Let (π_1, V_1) denote C^3 acted on by the standard representation and let
v_1 denote a weight vector corresponding to the highest weight (1,0). (So,
v_1 = (1,0,0).) Let (π_2, V_2) denote C^3 acted on by the representation (5.10)
and let v_2 denote a weight vector for the highest weight (0,1). (So,
v_2 = (0,0,1).) Now, consider the representation

\[ V_1 \otimes V_1 \otimes \cdots \otimes V_1 \otimes V_2 \otimes V_2 \otimes \cdots \otimes V_2, \]

where V_1 occurs m_1 times and V_2 occurs m_2 times. Note that the action of
sl(3;C) on this space is

\[ Z \mapsto (\pi_1(Z) \otimes I \otimes \cdots \otimes I)
   + (I \otimes \pi_1(Z) \otimes I \otimes \cdots \otimes I) + \cdots
   + (I \otimes \cdots \otimes I \otimes \pi_2(Z)). \tag{5.11} \]

Let π_{m_1,m_2} denote this representation.
Consider the vector

\[ v_{m_1,m_2} = v_1 \otimes v_1 \otimes \cdots \otimes v_1 \otimes v_2 \otimes v_2 \otimes \cdots \otimes v_2. \]

Then, applying (5.11) shows that

\[ \begin{aligned}
\pi_{m_1,m_2}(H_1) v_{m_1,m_2} &= m_1 v_{m_1,m_2}, \\
\pi_{m_1,m_2}(H_2) v_{m_1,m_2} &= m_2 v_{m_1,m_2}, \\
\pi_{m_1,m_2}(X_1) v_{m_1,m_2} &= 0, \\
\pi_{m_1,m_2}(X_2) v_{m_1,m_2} &= 0.
\end{aligned} \tag{5.12} \]

Now, the representation π_{m_1,m_2} is not irreducible (unless (m_1, m_2) = (1,0)
or (0,1)). However, if we let W denote the smallest invariant subspace
containing the vector v_{m_1,m_2}, then, in light of (5.12), W will be highest
weight cyclic with highest weight (m_1, m_2). Therefore, by Proposition 5.16,
W is irreducible with highest weight (m_1, m_2).

Thus, W is the representation we want. □


We have now completed the proof of Theorem 5.9. 


5.5 An Example: Highest Weight (1, 1) 


To obtain the irreducible representation with highest weight (1,1), we are
supposed to take the tensor product of the irreducible representations with
highest weights (1,0) and (0,1), and then extract a certain invariant subspace.
Let us establish some notation for the representations (1,0) and (0,1). In the
standard representation, the weight vectors for

\[ H_1 = \begin{pmatrix} 1 & 0 & 0 \\ 0 & -1 & 0 \\ 0 & 0 & 0 \end{pmatrix}, \quad
   H_2 = \begin{pmatrix} 0 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & -1 \end{pmatrix} \]

are the standard basis elements for C^3, namely e_1, e_2, and e_3. The
corresponding weights are (1,0), (−1,1), and (0,−1). The highest weight is
(1,0).
Recall that

\[ Y_1 = \begin{pmatrix} 0 & 0 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}, \quad
   Y_2 = \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 1 & 0 \end{pmatrix}. \]

Thus,

\[ \begin{aligned}
Y_1(e_1) &= e_2, & Y_2(e_1) &= 0, \\
Y_1(e_2) &= 0, & Y_2(e_2) &= e_3, \\
Y_1(e_3) &= 0, & Y_2(e_3) &= 0.
\end{aligned} \tag{5.13} \]

Now, the representation with highest weight (0,1) is the representation
π(Z) = −Z^tr, for Z ∈ sl(3;C). Let us define

\[ \bar{Z} = -Z^{tr} \]

for all Z ∈ sl(3;C). Thus, π(Z) = Z̄. Note that

\[ \bar{H}_1 = \begin{pmatrix} -1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 0 \end{pmatrix}, \quad
   \bar{H}_2 = \begin{pmatrix} 0 & 0 & 0 \\ 0 & -1 & 0 \\ 0 & 0 & 1 \end{pmatrix}. \]

The weight vectors are again e_1, e_2, and e_3, with weights (−1,0), (1,−1), and
(0,1), respectively. The highest weight is (0,1).
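Define new basis elements

\[ f_1 = e_3, \quad f_2 = -e_2, \quad f_3 = e_1. \]

Then, since

\[ \bar{Y}_1 = \begin{pmatrix} 0 & -1 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}, \quad
   \bar{Y}_2 = \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & -1 \\ 0 & 0 & 0 \end{pmatrix}, \]

we have

\[ \begin{aligned}
\bar{Y}_1(f_1) &= 0, & \bar{Y}_2(f_1) &= f_2, \\
\bar{Y}_1(f_2) &= f_3, & \bar{Y}_2(f_2) &= 0, \\
\bar{Y}_1(f_3) &= 0, & \bar{Y}_2(f_3) &= 0.
\end{aligned} \tag{5.14} \]

Note that the highest weight vector is f_1 = e_3.

So, to obtain an irreducible representation with highest weight (1,1), we
are supposed to take the tensor product of the representations with highest
weights (1,0) and (0,1), and then take the smallest invariant subspace
containing the vector e_1 ⊗ f_1. In light of the proof of Proposition 5.13,
this smallest invariant subspace is obtained by starting with e_1 ⊗ f_1 and
applying all possible combinations of Y_1 and Y_2.

Recall that if π_1 and π_2 are two representations of the Lie algebra sl(3;C),
then

\[ \begin{aligned}
(\pi_1 \otimes \pi_2)(Y_1) &= \pi_1(Y_1) \otimes I + I \otimes \pi_2(Y_1), \\
(\pi_1 \otimes \pi_2)(Y_2) &= \pi_1(Y_2) \otimes I + I \otimes \pi_2(Y_2).
\end{aligned} \]

In our case, we want π_1(Y_i) = Y_i and π_2(Y_i) = Ȳ_i. Thus,

\[ \begin{aligned}
(\pi_1 \otimes \pi_2)(Y_1) &= Y_1 \otimes I + I \otimes \bar{Y}_1, \\
(\pi_1 \otimes \pi_2)(Y_2) &= Y_2 \otimes I + I \otimes \bar{Y}_2.
\end{aligned} \]

The actions of Y_i and Ȳ_i are described in (5.13) and (5.14).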

Note that π_1 ⊗ π_2 is not an irreducible representation. The representation
π_1 ⊗ π_2 has dimension 9, whereas the smallest invariant subspace containing
e_1 ⊗ f_1 has, as it turns out, dimension 8.

So, it remains only to begin with e_1 ⊗ f_1, apply Y_1 ⊗ I + I ⊗ Ȳ_1 and
Y_2 ⊗ I + I ⊗ Ȳ_2 repeatedly until we get zero, and then figure out what
dependence relations exist among the vectors we get. This calculation is
contained in the following chart. Here, there are two arrows coming out of each
vector. Of these, the left arrow indicates the action of Y_1 ⊗ I + I ⊗ Ȳ_1, and
the right arrow indicates the action of Y_2 ⊗ I + I ⊗ Ȳ_2. To save space, I have
omitted the tensor product symbol and written, for example, e_2f_2 instead of
e_2 ⊗ f_2.


                              e_1f_1
                           ↙          ↘
                   e_2f_1                e_1f_2
                  ↙      ↘              ↙      ↘
                0   e_3f_1 + e_2f_2   e_2f_2 + e_1f_3   0
                      ↙      ↘          ↙      ↘
                 e_2f_3    2e_3f_2   2e_2f_3    e_3f_2
                 ↙   ↘      ↙   ↘     ↙   ↘      ↙   ↘
                0  e_3f_3 2e_3f_3 0  0 2e_3f_3 e_3f_3  0


A basis for the space spanned by these vectors is e_1f_1, e_2f_1, e_1f_2,
e_3f_1 + e_2f_2, e_2f_2 + e_1f_3, e_2f_3, e_3f_2, and e_3f_3. (These vectors are
linearly independent and every vector listed above is a constant multiple of
one of these.) So, the dimension of this representation is 8; it is
(isomorphic to) the adjoint representation.

The weights for this representation are (1,1), (−1,2), (2,−1), (0,0),
(1,−2), (−2,1), and (−1,−1). Each weight has multiplicity 1 except for (0,0),
which has multiplicity 2 because e_3f_1 + e_2f_2 and e_2f_2 + e_1f_3 are both
weight vectors with weight (0,0).
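
The calculation of this section is mechanical enough to be repeated by
machine. The following is a rough sketch in plain Python, not a definitive
implementation: a vector in C^3 ⊗ C^3 is stored as a dictionary sending the
index pair (i, j) (0-based, standing for e_{i+1} ⊗ f_{j+1}) to its coefficient,
and the tables `E_LOWER`, `F_LOWER`, `E_WEIGHT`, and `F_WEIGHT` simply encode
(5.13), (5.14), and the weights listed above. All names are illustrative
assumptions.

```python
from fractions import Fraction

# Lowering maps on basis indices (0-based), read off from (5.13) and (5.14).
E_LOWER = {1: {0: 1}, 2: {1: 2}}   # Y_1: e1 -> e2;      Y_2: e2 -> e3
F_LOWER = {1: {1: 2}, 2: {0: 1}}   # Ybar_1: f2 -> f3;   Ybar_2: f1 -> f2
E_WEIGHT = [(1, 0), (-1, 1), (0, -1)]   # weights of e1, e2, e3
F_WEIGHT = [(0, 1), (1, -1), (-1, 0)]   # weights of f1, f2, f3

def lower(which, vec):
    """Apply Y_which (x) I + I (x) Ybar_which to vec = {(i, j): coeff}."""
    out = {}
    for (i, j), c in vec.items():
        if i in E_LOWER[which]:
            key = (E_LOWER[which][i], j)
            out[key] = out.get(key, 0) + c
        if j in F_LOWER[which]:
            key = (i, F_LOWER[which][j])
            out[key] = out.get(key, 0) + c
    return {k: c for k, c in out.items() if c != 0}

def rank(vectors):
    """Rank of a list of vectors, by exact Gaussian elimination."""
    rows = [[Fraction(v.get((i, j), 0)) for i in range(3) for j in range(3)]
            for v in vectors]
    r = 0
    for col in range(9):
        pivot = next((k for k in range(r, len(rows)) if rows[k][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for k in range(len(rows)):
            if k != r and rows[k][col] != 0:
                factor = rows[k][col] / rows[r][col]
                rows[k] = [a - factor * b for a, b in zip(rows[k], rows[r])]
        r += 1
    return r

def weight(vec):
    """Weight of a weight vector (all of its terms share the same weight)."""
    i, j = next(iter(vec))
    return (E_WEIGHT[i][0] + F_WEIGHT[j][0], E_WEIGHT[i][1] + F_WEIGHT[j][1])

# Apply the two lowering operators repeatedly, starting from e1 (x) f1.
start = {(0, 0): 1}
found, frontier = [start], [start]
while frontier:
    new = []
    for v in frontier:
        for which in (1, 2):
            w = lower(which, v)
            if w and w not in found:
                found.append(w)
                new.append(w)
    frontier = new

dimension = rank(found)
multiplicity = {wt: rank([v for v in found if weight(v) == wt])
                for wt in {weight(v) for v in found}}
print(dimension, multiplicity)
```

Here `dimension` is the dimension of the span of all the vectors in the chart,
and `multiplicity` records, for each weight, the dimension of the span of the
chart vectors having that weight.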


5.6 The Weyl Group 


There is an important symmetry to the representations of sl(3;C) involving
something called the Weyl group. (Our treatment will follow the compact
group approach, which is apparently different from the Lie algebra approach,
but ultimately equivalent to it.) To understand the idea behind the Weyl group
symmetry, let us observe that the representations of sl(3;C) are, in a certain
sense, invariant under the adjoint action of SU(3). What I mean by this is the
following. Let π be a finite-dimensional representation of sl(3;C) acting on a
vector space V and let Π be the associated representation of SU(3) acting on
the same space. For any A ∈ SU(3), we can define a new representation π_A of
sl(3;C), acting on the same vector space V, by setting

\[ \pi_A(X) = \pi(AXA^{-1}). \]

Since the adjoint action of A on sl(3;C) is a Lie algebra automorphism, this
is, again, a representation of sl(3;C). This new representation is easily seen
to be equivalent to the original representation; direct calculation shows that
Π(A) is an intertwining map between (π, V) and (π_A, V). We may say, then,
that the adjoint action of SU(3) is a symmetry of the set of equivalence classes
of representations of sl(3;C).

Now, we have analyzed the representations of sl(3;C) by simultaneously
diagonalizing the operators π(H_1) and π(H_2). Of course, this means that any
linear combination of π(H_1) and π(H_2) is also simultaneously diagonalized. So,
what really counts is the two-dimensional subspace h of sl(3;C) spanned by H_1
and H_2. (This space is called a Cartan subalgebra of sl(3;C). See Chapter
6 for more information.) Now, in general, the adjoint action of A ∈ SU(3)
will not preserve the space h and so the equivalence of π and π_A does not (in
general) tell us anything about the weights of π. However, there are elements
A in SU(3) for which Ad_A does preserve h. These elements make up the Weyl
group for SU(3) and (as we shall see below) give rise to a symmetry of the
set of weights of any representation π. So, we may say that the Weyl group is
the “residue” of the adjoint symmetry of the representations (discussed in the
previous paragraph) that is left after we focus our attention on the subspace
h of sl(3;C).


Definition 5.20. Let h be the two-dimensional subspace of sl(3;C) spanned by
H_1 and H_2. Let Z be the subgroup of SU(3) consisting of those A ∈ SU(3) such
that Ad_A(H) = H for all H ∈ h. Let N be the subgroup of SU(3) consisting
of those A ∈ SU(3) such that Ad_A(H) is an element of h for all H in h.


It is a straightforward exercise (Exercise 8) to verify that Z and N are 
actually subgroups of SU(3) and to verify that Z is a normal subgroup of N. 
This leads us to the definition of the Weyl group. 


Definition 5.21. The Weyl group of SU(3), denoted W, is the quotient 
group N/Z. 


We can define an action of W on h as follows. For each element w of W,
choose an element A of the corresponding equivalence class in N. Then for H
in h we define the action w · H of w on H by

\[ w \cdot H = \mathrm{Ad}_A(H). \]

To see that this action is well defined, suppose B is another element of the
same equivalence class as A. Then B = AC with C ∈ Z and, thus,

\[ \mathrm{Ad}_B(H) = \mathrm{Ad}_A \mathrm{Ad}_C(H) = \mathrm{Ad}_A(H), \]

by the definition of Z. It is easily seen that W is isomorphic to the group of
linear transformations of h that can be expressed as Ad_A for some A ∈ N.
The following proposition will allow us to compute W explicitly.


Proposition 5.22. The group Z consists precisely of the diagonal matrices
inside SU(3), namely the matrices of the form

\[ A = \begin{pmatrix} e^{i\theta} & 0 & 0 \\ 0 & e^{i\phi} & 0 \\ 0 & 0 & e^{-i(\theta+\phi)} \end{pmatrix} \tag{5.15} \]

for θ and φ in R. The group N consists of precisely those matrices A ∈ SU(3)
such that for each k = 1,2,3, there exist l ∈ {1,2,3} and θ ∈ R such that
Ae_k = e^{iθ}e_l. Here, e_1, e_2, e_3 is the standard basis for C^3.

The Weyl group W = N/Z is isomorphic to the permutation group on
three elements.


Proof. Suppose A is in Z, which means that A commutes with all elements
of h. Then, certainly, A must commute with H_1. Now, the matrix H_1 has
eigenvalues 1, −1, and 0. The corresponding eigenspaces are the span of e_1,
the span of e_2, and the span of e_3. Since A commutes with H_1, it must preserve
each of these eigenspaces (Proposition B.4). This means that Ae_k must be a
multiple of e_k for each k = 1,2,3. This is the same as saying that A is diagonal.
If A is also to be unitary and have determinant 1, then it must be of the form
in the proposition. Conversely, any matrix of the form (5.15) does indeed
commute not only with H_1 but also with H_2 and, thus, with every element of
h. So, Z consists precisely of the matrices of the form (5.15).

Now, suppose that A is in N. Then, AH_1A^{-1} must be in h and therefore
must be diagonal. Now, H_1 has eigenvectors e_1, e_2, and e_3 with distinct
eigenvalues 1, −1, 0. Then, AH_1A^{-1} will have eigenvectors Ae_1, Ae_2, and Ae_3
with the same eigenvalues, 1, −1, 0. Since the eigenvalues of AH_1A^{-1} are
distinct, the only eigenvectors it has are multiples of Ae_1, multiples of Ae_2,
and multiples of Ae_3. (This would not be the case if AH_1A^{-1} had a repeated
eigenvalue.) On the other hand, AH_1A^{-1} is diagonal, which means it has e_1,
e_2, and e_3 as eigenvectors. The only way these two descriptions of the
eigenvectors of AH_1A^{-1} can agree is if each Ae_k is a constant multiple of
some e_l. The constant must have absolute value 1 if A is unitary.

Conversely, suppose A is in SU(3) and A takes each e_k to e^{iθ}e_l for some l
and θ (depending on k). Then, the eigenvectors for AH_1A^{-1} will still be e_1,
e_2, and e_3 (but with the eigenvalues possibly in a different order) and so
AH_1A^{-1} will be diagonal. Furthermore, since H_1 has trace zero, AH_1A^{-1}
will also have trace zero. However, h consists of all diagonal 3 × 3 matrices
with trace zero, and so this shows that AH_1A^{-1} is in h. The same argument
shows that AH_2A^{-1} is in h.

Now, let us think about what Ad_A looks like as a linear transformation of
h, for A in N. For k = 1,2,3, let σ(k) be the element of {1,2,3} such that A
maps e_k to a multiple of e_{σ(k)}. Since A is invertible, the map k → σ(k) must
be a permutation of the set {1,2,3}. Now each element H of h is a diagonal
matrix, which means that H has e_1, e_2, and e_3 as eigenvectors; the diagonal
entries of H are precisely the corresponding eigenvalues λ_1, λ_2, and λ_3. Then,
AHA^{-1} has e_{σ(k)} as an eigenvector with eigenvalue λ_k. This means that the
diagonal entry of H that was originally in the kth spot of H is now in the
σ(k)th spot.

We see, then, that A acts on h by permuting the diagonal entries of each
H ∈ h according to the permutation σ. Thus, the group of linear
transformations of h of the form Ad_A, A ∈ N, is isomorphic to the permutation
group on three elements. This group of linear transformations is (isomorphic
to) the Weyl group W = N/Z.

Note that although each element A of N maps each e_k to some constant multiple
e^{iθ_k} of e_{σ(k)}, the action of Ad_A on h depends only on the value of σ(k)
and not on the constants e^{iθ_k}. This reflects that if one multiplies any
A ∈ N on the right by some B ∈ Z, then Ad_{AB}(H) = Ad_A Ad_B(H) = Ad_A(H), since
by the definition of Z, the adjoint action of B on h is just the identity. □


In the case of SU(3), it is possible to identify the Weyl group with a certain 
subgroup of N, instead of as the quotient group N/Z. See Exercise 9. Exercise 
10 asks one to verify by direct calculation that the action of a particular 
element A of N is as described in the above proof. 
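
The permutation rule in the proof above can also be spot-checked numerically.
The sketch below (plain Python, illustrative names only) uses honest
permutation matrices. Those belonging to odd permutations have determinant −1
and so are not in SU(3), but multiplying by the scalar −1 as in Exercise 9
repairs this without changing AHA^{-1}, since scalars cancel in a
conjugation. The sample entries in `lam` are an arbitrary choice with sum
zero.

```python
from itertools import permutations

def perm_matrix(sigma):
    """Matrix A with A e_k = e_{sigma[k]} (0-based indices)."""
    return [[1 if sigma[c] == r else 0 for c in range(3)] for r in range(3)]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def transpose(a):
    return [[a[j][i] for j in range(3)] for i in range(3)]

lam = [5, -2, -3]   # sample diagonal entries of H (they sum to zero)
H = [[lam[i] if i == j else 0 for j in range(3)] for i in range(3)]

conjugates = {}
for sigma in permutations(range(3)):
    A = perm_matrix(sigma)
    # For a permutation matrix, the inverse is the transpose.
    conjugates[sigma] = matmul(matmul(A, H), transpose(A))

# A H A^{-1} is again diagonal ...
stays_diagonal = all(c[i][j] == 0 for c in conjugates.values()
                     for i in range(3) for j in range(3) if i != j)
# ... and the entry from spot k has moved to spot sigma(k).
entry_moves = all(c[sigma[k]][sigma[k]] == lam[k]
                  for sigma, c in conjugates.items() for k in range(3))
# With distinct entries, the six permutations act in six different ways.
distinct_actions = len({tuple(c[k][k] for k in range(3))
                        for c in conjugates.values()})
print(stays_diagonal, entry_moves, distinct_actions)
```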

We want to show that the Weyl group is a symmetry of the weights of any
finite-dimensional representation of sl(3;C). To understand this, we need to
adopt a less basis-dependent view of the weights. We have defined a weight as a
pair (m_1, m_2) of simultaneous eigenvalues for π(H_1) and π(H_2). However, if a
vector v is an eigenvector for π(H_1) and π(H_2), then it is also an eigenvector
for π(H) for any element H of the space h spanned by H_1 and H_2. Furthermore,
the eigenvalues must depend linearly on H since if H and J are any two
elements of h and π(H)v = λ_1v and π(J)v = λ_2v, then

\[ \begin{aligned}
\pi(aH + bJ)v &= (a\pi(H) + b\pi(J))v \\
&= (a\lambda_1 + b\lambda_2)v.
\end{aligned} \]

So, we may adopt the following basis-independent notion of a weight.


Definition 5.23. Let h be the subspace of sl(3;C) spanned by H_1 and H_2 and
let π be a finite-dimensional representation of sl(3;C) acting on a vector space
V. A linear functional μ ∈ h* is called a weight for π if there exists a nonzero
vector v in V such that

\[ \pi(H)v = \mu(H)v \]

for all H in h. Such a vector v is called a weight vector with weight μ.


So, a weight is just a collection of simultaneous eigenvalues of all the
elements H of h, which, as we have noted, must depend linearly on H and
which, therefore, define a linear functional on h. Since H_1 and H_2 span h, the
linear functional μ is determined by the values of μ(H_1) and μ(H_2), and thus
our new notion of weight is equivalent to our old notion of a weight as just a
pair of simultaneous eigenvalues of π(H_1) and π(H_2). The reason for adopting
this basis-independent approach is that the action of the Weyl group does not
preserve the basis {H_1, H_2} for h.

The Weyl group is (or may be thought of as) a group of linear
transformations of h. This means that W acts linearly on h, and we denote this
action as w · H. We can define an associated action on the dual space h* as in
the definition of the dual representation in Chapter 4. Thus, for μ ∈ h* and
w ∈ W, we define w · μ to be the element of h* given by

\[ (w \cdot \mu)(H) = \mu(w^{-1} \cdot H). \tag{5.16} \]

We now come to the main point of the Weyl group from the point of view
of representation theory, namely that the weights of any representation are
invariant under the action of the Weyl group.


Theorem 5.24. Suppose that π is any finite-dimensional representation of
sl(3;C) and that μ ∈ h* is a weight for π. Then, for any w ∈ W, w · μ is also
a weight of π, and the multiplicity of w · μ is the same as the multiplicity of
μ.


Proof. Suppose that μ is a weight for a representation (π, V) of sl(3;C) and
suppose that v is a weight vector with weight μ. Then, let Π be the associated
representation of the simply-connected group SU(3) and consider the vector
Π(A)v, for A ∈ N. We want to show that Π(A)v is, again, a weight vector.
So, we compute

\[ \begin{aligned}
\pi(H)\Pi(A)v &= \Pi(A)\left(\Pi(A)^{-1}\pi(H)\Pi(A)\right)v \\
&= \Pi(A)\pi(A^{-1}HA)v \\
&= \mu(A^{-1}HA)\,\Pi(A)v.
\end{aligned} \]

Here, we have used that A is in N, which guarantees that A^{-1}HA is, again,
in h. However, A^{-1}HA is nothing but w^{-1} · H, where w is the Weyl group
element represented by A. Thus, by (5.16), μ(A^{-1}HA) = (w · μ)(H). This
shows that Π(A)v is, again, a weight vector, with weight w · μ, and thus w · μ
is, again, a weight for (π, V). The same sort of reasoning shows that Π(A)
is an invertible map of the weight space associated to the weight μ onto the
weight space with weight w · μ, whose inverse is Π(A)^{-1}. This means that the
two weight spaces have the same dimension and, therefore, μ and w · μ have
the same multiplicity. □


Note that since the roots are nothing but the nonzero weights of the adjoint
representation, this result tells us that the roots are invariant under the action
of the Weyl group. In order to visualize the action of the Weyl group, it is
convenient to identify h* with h by means of an inner product on h that is
invariant under the action of the Weyl group. Recall that h is a subspace of
the space of diagonal matrices, and we use on the space of diagonal matrices
the inner product obtained by identifying it with C^3 in the obvious way. (This
inner product is, if one prefers, the restriction to the diagonal matrices of the
Hilbert-Schmidt inner product ⟨A, B⟩ = trace(A*B). See Section B.6.) Since
the Weyl group acts by permuting the diagonal entries, this inner product
(restricted to the subspace h) is preserved by the action of W.

We now use this inner product on h to identify h* with h. Given any element α
of h, the map H → ⟨α, H⟩ is a linear functional on h (i.e., an element of h*).
Every linear functional on h can be represented in this way for a unique α in
h (Section B.7). We will now simply identify each linear functional with the
corresponding element of h. Thus, we will now regard a weight for (π, V) as
an element α of h with the property that there exists a nonzero v in V
such that

\[ \pi(H)v = \langle \alpha, H \rangle v \tag{5.17} \]

for all H in h. This is the same as Definition 5.23 except that, now, α lives in
h and we write ⟨α, H⟩ instead of α(H) on the right. The roots, being weights
for the adjoint representation, are viewed in a similar way.

Now that the roots and weights live in h instead of h*, we can use the
above inner product on h. Furthermore, it can be shown (Exercise 11) that
under our identification of h* with h, the action of W on h* (described in
(5.16)) coincides with the adjoint action of W on h.

We are now ready to begin calculating. I claim that with our new point of
view the roots α_1 and α_2 are identified with the following elements of h:

\[ \alpha_1 = \begin{pmatrix} 1 & & \\ & -1 & \\ & & 0 \end{pmatrix}, \quad
   \alpha_2 = \begin{pmatrix} 0 & & \\ & 1 & \\ & & -1 \end{pmatrix}. \]

To check this, we note that these matrices are indeed in h since the diagonal
entries sum to zero. Then, direct calculation shows that ⟨α_1, H_1⟩ = 2,
⟨α_1, H_2⟩ = −1 and ⟨α_2, H_1⟩ = −1, ⟨α_2, H_2⟩ = 2, in agreement with our earlier
definition (5.4) of α_1 and α_2. So, then, we can compute the lengths and
angles as ‖α_1‖² = ⟨α_1, α_1⟩ = 2, ‖α_2‖² = ⟨α_2, α_2⟩ = 2, and ⟨α_1, α_2⟩ = −1.
This means that (with respect to this inner product) α_1 and α_2 both have
length √2 and the angle θ between them satisfies cos θ = −1/2, so that θ = 120°.
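
As a quick arithmetic check of the numbers just quoted, here is a small sketch
in plain Python. Diagonal matrices are stored as triples of their diagonal
entries, and since all entries are real, the Hilbert-Schmidt inner product
reduces to the ordinary dot product; the function name `inner` is an
illustrative choice.

```python
import math

# Diagonal matrices stored as triples of diagonal entries.
alpha1, alpha2 = (1, -1, 0), (0, 1, -1)
H1, H2 = (1, -1, 0), (0, 1, -1)

def inner(a, b):
    """Inner product of two real diagonal matrices, trace(A* B)."""
    return sum(x * y for x, y in zip(a, b))

# Rows: alpha1, alpha2; columns: pairing with H1, H2.
pairings = [[inner(a, h) for h in (H1, H2)] for a in (alpha1, alpha2)]

length_sq = (inner(alpha1, alpha1), inner(alpha2, alpha2))
cos_theta = inner(alpha1, alpha2) / math.sqrt(length_sq[0] * length_sq[1])
angle_degrees = math.degrees(math.acos(cos_theta))
print(pairings, length_sq, cos_theta, angle_degrees)
```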

We now consider the dominant integral elements, which are the possible
highest weights of irreducible representations of sl(3;C). With our new point
of view, these are the elements μ of h such that ⟨μ, H_1⟩ and ⟨μ, H_2⟩ are
non-negative integers. We begin by considering the fundamental weights μ_1 and
μ_2 defined by

\[ \begin{aligned}
\langle \mu_1, H_1 \rangle &= 1, & \langle \mu_2, H_1 \rangle &= 0, \\
\langle \mu_1, H_2 \rangle &= 0, & \langle \mu_2, H_2 \rangle &= 1.
\end{aligned} \]

A little trial and error shows that these can be expressed in terms of α_1 and
α_2 as follows:

\[ \begin{aligned}
\mu_1 &= \tfrac{2}{3}\alpha_1 + \tfrac{1}{3}\alpha_2, \\
\mu_2 &= \tfrac{1}{3}\alpha_1 + \tfrac{2}{3}\alpha_2.
\end{aligned} \]

Plugging in the expressions for α_1 and α_2, we get

\[ \mu_1 = \begin{pmatrix} \tfrac{2}{3} & & \\ & -\tfrac{1}{3} & \\ & & -\tfrac{1}{3} \end{pmatrix}, \quad
   \mu_2 = \begin{pmatrix} \tfrac{1}{3} & & \\ & \tfrac{1}{3} & \\ & & -\tfrac{2}{3} \end{pmatrix}. \]

An elementary calculation then shows that μ_1 and μ_2 each have length √6/3
and that the angle between them is 60°. The set of dominant integral elements
is then precisely the set of linear combinations of μ_1 and μ_2 with non-negative
integer coefficients. Note that μ_1 + μ_2 = α_1 + α_2, an observation that helps
in drawing Figure 5.2 below.
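
The “trial and error” can be replaced by solving a 2 × 2 linear system. The
sketch below (plain Python with exact fractions; the names `solve` and `combo`
are illustrative) finds the coefficients of μ_1 and μ_2 in terms of α_1 and
α_2 by Cramer's rule and then recomputes the length and angle. Diagonal
matrices are again stored as triples of diagonal entries.

```python
from fractions import Fraction

alpha1, alpha2 = (1, -1, 0), (0, 1, -1)
H1, H2 = (1, -1, 0), (0, 1, -1)

def inner(a, b):
    return sum(x * y for x, y in zip(a, b))

def solve(target):
    """Find (a, b) with <a*alpha1 + b*alpha2, H_j> = target[j-1] for j = 1, 2."""
    m11, m12 = inner(alpha1, H1), inner(alpha2, H1)
    m21, m22 = inner(alpha1, H2), inner(alpha2, H2)
    det = m11 * m22 - m12 * m21
    a = Fraction(target[0] * m22 - m12 * target[1], det)   # Cramer's rule
    b = Fraction(m11 * target[1] - m21 * target[0], det)
    return a, b

def combo(a, b):
    """The diagonal matrix a*alpha1 + b*alpha2, as a triple."""
    return tuple(a * x + b * y for x, y in zip(alpha1, alpha2))

coeffs1, coeffs2 = solve((1, 0)), solve((0, 1))
mu1, mu2 = combo(*coeffs1), combo(*coeffs2)

length_sq = inner(mu1, mu1)                    # square of the length of mu1
cos_sq = inner(mu1, mu2) ** 2 / (inner(mu1, mu1) * inner(mu2, mu2))
sum_matches = tuple(x + y for x, y in zip(mu1, mu2)) == combo(1, 1)
print(coeffs1, coeffs2, mu1, mu2, length_sq, cos_sq, sum_matches)
```

A squared length of 2/3 corresponds to the length √6/3, and a squared cosine
of 1/4 together with a positive inner product corresponds to an angle of 60°.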

We are now finally ready to draw some pictures. Figure 5.2 shows the
same information as Figure 5.1, namely, the roots and the dominant integral
elements, but now drawn relative to a Weyl-invariant inner product. We draw
only the two-dimensional real subspace of h consisting of those elements μ
such that ⟨μ, H_1⟩ and ⟨μ, H_2⟩ are real, since all the roots and weights have
this property. In this figure, the arrows indicate the roots, the black dots
indicate dominant integral elements (i.e., points μ such that ⟨μ, H_1⟩ and
⟨μ, H_2⟩ are non-negative integers), and the triangular grid indicates integral
elements (i.e., points μ such that ⟨μ, H_1⟩ and ⟨μ, H_2⟩ are integers).


Fig. 5.2. Roots and dominant integral elements for sl(3;C) (using Weyl-invariant
inner product)


Let us see how the Weyl group acts on Figure 5.2. Let (1,2,3) denote the
cyclic permutation that takes 1 to 2 to 3 to 1, and let w_(1,2,3) denote the
corresponding Weyl group element (Exercise 10). Then, w_(1,2,3) acts by
cyclically permuting the diagonal entries of each element of h. Thus, w_(1,2,3)
takes α_1 to α_2 and takes α_2 to −(α_1 + α_2). This action is a 120° rotation,
counterclockwise in Figure 5.2. Next, let (1,2) be the permutation that
interchanges 1 and 2 and let w_(1,2) be the corresponding Weyl group element.
Then, w_(1,2) acts by interchanging the first two diagonal entries of each
element of h, and thus takes α_1 to −α_1 and takes α_2 to α_1 + α_2. This
corresponds to a reflection about the line perpendicular to α_1. The reader is
invited to calculate the action of the remaining Weyl group elements. The Weyl
group consists of six elements: the identity, clockwise and counterclockwise
rotations by 120°, and three reflections (about the line perpendicular to α_1,
about the line perpendicular to α_2, and about the line perpendicular to
α_1 + α_2). This is the symmetry of an equilateral triangle centered at the
origin, as indicated in Figure 5.3.
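
For readers who want to check their calculation of the remaining Weyl group
elements, the following sketch in plain Python tabulates the action of all six
permutations on α_1 and α_2. The conventions are assumptions of the sketch:
permutations are 0-based tuples `sigma` with `sigma[k]` the new spot of the
entry in spot k, and each image is reported by its coefficients (a, b) in
a·α_1 + b·α_2.

```python
from itertools import permutations

# Diagonal matrices stored as triples of diagonal entries.
alpha1, alpha2 = (1, -1, 0), (0, 1, -1)

def act(sigma, h):
    """Move the diagonal entry in spot k to spot sigma[k]."""
    out = [0, 0, 0]
    for k in range(3):
        out[sigma[k]] = h[k]
    return tuple(out)

def in_root_basis(h):
    """Coefficients (a, b) with h = a*alpha1 + b*alpha2 = (a, b - a, -b)."""
    return (h[0], -h[2])

# The six roots, in (a, b) coordinates.
roots = {(1, 0), (0, 1), (1, 1), (-1, 0), (0, -1), (-1, -1)}

table = {sigma: (in_root_basis(act(sigma, alpha1)),
                 in_root_basis(act(sigma, alpha2)))
         for sigma in permutations(range(3))}

cycle = (1, 2, 0)   # the permutation 1 -> 2 -> 3 -> 1, written 0-based
swap = (1, 0, 2)    # the permutation interchanging 1 and 2
roots_preserved = all(image in roots
                      for pair in table.values() for image in pair)

for sigma, (img1, img2) in sorted(table.items()):
    print(sigma, "alpha1 ->", img1, " alpha2 ->", img2)
```

The entries for `cycle` and `swap` reproduce the two cases worked out above,
and `roots_preserved` confirms that every Weyl group element maps α_1 and α_2
to roots.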


Fig. 5.3. Weyl group for sl(3;C)


5.7 Weight Diagrams 


In this section, we consider weight diagrams for sl(3;C) (i.e., pictures of the 
weights of various representations of sl(3;C)). (These weight diagrams should 
not be confused with Dynkin diagrams, which are discussed in Chapter 8.) 
The action of the Weyl group is critical to understanding which weights arise 
in a representation with a given highest weight μ_0.


Definition 5.25. Let v_1, ..., v_n be a finite collection of points in a vector
space V. The convex hull of v_1, ..., v_n is the set of all vectors in V that can
be expressed as

\[ c_1 v_1 + c_2 v_2 + \cdots + c_n v_n, \]

where c_1, ..., c_n are non-negative real constants satisfying
c_1 + ··· + c_n = 1.


Equivalently, the convex hull is the smallest convex subset of V containing
all of the points v_1, ..., v_n.

To make the weight diagrams, we make use of the following result. As in 
(5.17), we regard the weights (and so also the roots) as elements of h. 


Theorem 5.26. Suppose that π is an irreducible representation of sl(3;C)
with highest weight μ_0. Then, an element μ of h is a weight of the
representation π if and only if the following two conditions are satisfied:

1. μ is contained in the convex hull of the orbit of μ_0 under the Weyl group.
2. μ_0 − μ is expressible as a linear combination of α_1 and α_2 with integer
coefficients.


Let us think first about Condition 1. The Weyl group of SU(3) has six
elements and the orbit of a “generic” point in h will consist of six points.
These six points will form the vertices of a hexagon. The simplest way to see
how this will work out is first to apply to μ_0 a reflection (say, about the line
perpendicular to α_1) and then to apply 120° clockwise and counterclockwise
rotations to the resulting pair of points to get a total of six points. Suppose,
however, that one starts with a dominant integral element μ_0 that is on the
edge of the set of all dominant integral elements (these are the elements of
the form (m_1, 0) or (0, m_2) in our old view of the weights). Then, μ_0 is left
invariant by either the reflection about the line perpendicular to α_1 or by the
reflection about the line perpendicular to α_2. In that case, the orbit of μ_0 is
a triangle (unless μ_0 = 0, in which case the orbit is just a single point).
Finally, the convex hull of the orbit of μ_0 will be a filled-in triangle or
hexagon (or a single point if μ_0 = 0).

Let us think now about Condition 2. Condition 2 implies that μ must be
an integral element (i.e., that ⟨μ, H_1⟩ and ⟨μ, H_2⟩ are integers), since μ_0,
α_1, and α_2 all have this property. However, not every integral element will
satisfy Condition 2. Suppose, for example, that μ_0 = (1,0) and μ = (0,0). Then,
μ_0 − μ = (2/3)α_1 + (1/3)α_2 and the coefficients are not integral. So, (0,0) is
not a weight of the irreducible representation with highest weight (1,0). In
most cases, there will be integral elements contained in the convex hull of the
orbit of μ_0 that are not weights of the representation with highest weight μ_0.

I will give only the main idea of the proof of Theorem 5.26. It is not too
hard to show that Conditions 1 and 2 are both necessary conditions for the
weights of the representation with highest weight μ_0. See Exercises 13 and
14. Showing that the conditions are sufficient is an sl(2;C) argument and
makes use of the fact that there can be no “gaps” in the eigenvalues of H in
a finite-dimensional representation of sl(2;C): If a non-negative integer k is
an eigenvalue for H in some representation, then so are k − 2, k − 4, ..., −k.
(One starts with the orbit of μ_0 under the Weyl group and then uses the
just-mentioned result, applied to various sl(2;C) subalgebras of sl(3;C), to
“fill in” all the elements satisfying Conditions 1 and 2.)
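
Theorem 5.26 lends itself to a small computation. The sketch below (plain
Python, illustrative names) lists the weights for a given highest weight,
working in the coordinates (⟨μ, H_1⟩, ⟨μ, H_2⟩), in which α_1 = (2,−1) and
α_2 = (−1,2). It rests on an assumption that should be stated loudly: instead
of testing convex-hull membership directly, it tests whether μ_0 − w · μ is a
combination of α_1 and α_2 with non-negative integer coefficients for every w
in W. Exercise 14 is one direction of the equivalence of this test with
Conditions 1 and 2; the other direction is taken here as a standard fact.

```python
from itertools import permutations

def to_diag(wt):
    """3 times the diagonal matrix a*mu1 + b*mu2, for wt = (a, b)."""
    a, b = wt
    return (2 * a + b, -a + b, -a - 2 * b)

def from_diag(d):
    """Recover (<mu, H1>, <mu, H2>) from 3*mu stored as a triple."""
    return ((d[0] - d[1]) // 3, (d[1] - d[2]) // 3)

def orbit(wt):
    """Weyl group orbit: all permutations of the diagonal entries."""
    return {from_diag(p) for p in permutations(to_diag(wt))}

def is_lower(wt, top):
    """True if top - wt = p*alpha1 + q*alpha2 with non-negative integers p, q."""
    x, y = top[0] - wt[0], top[1] - wt[1]
    p3, q3 = 2 * x + y, x + 2 * y          # 3p and 3q
    return p3 >= 0 and q3 >= 0 and p3 % 3 == 0 and q3 % 3 == 0

def weights(top):
    """Weights of the irreducible representation with highest weight top."""
    bound = top[0] + top[1]                # coordinates cannot exceed this
    result = []
    for a in range(-bound, bound + 1):
        for b in range(-bound, bound + 1):
            if all(is_lower(w, top) for w in orbit((a, b))):
                result.append((a, b))
    return sorted(result)

print(weights((1, 0)))
print(weights((1, 1)))
```

For the highest weight (1,0), the output omits (0,0), in agreement with the
example discussed under Condition 2. The sketch says nothing about
multiplicities.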

Theorem 5.26 tells us which weights occur in a given representation of 
sl(3;C) but not what the multiplicities of the weights are. It can be shown 
that the multiplicities obey the following simple pattern. The weights occur 
in “rings” in which the rings toward the outside are hexagons and the rings 
toward the inside are triangles. The weights in the outermost ring have mul- 
tiplicity 1. The multiplicities then increase by 1 each time one moves inward 
one ring, until the rings become triangles, at which point the multiplicities 
stabilize. The situation for the multiplicities in other semisimple Lie algebras 
is more complicated—see Section 7.6. 
Figures 5.4, 5.5, and 5.6 show the weights and multiplicities for three
irreducible representations, with highest weights (1,2), (2,2), and (4,0),
respectively. In each figure, a black dot indicates a weight of the
representation, with the highest weight being circled. A number next to a dot
indicates the multiplicity of the corresponding weight. A dot without a number
indicates a weight of multiplicity one. In Figure 5.4, the dashed lines
extending from the highest weight indicate the boundary of the set of points
that are lower than (1,2). The dimensions of these representations are 15, 27,
and 15, respectively, as can be computed either from the dimension formula
(Theorem 5.10) or by adding up the multiplicities of all the weights.
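
The three dimensions can be reproduced in a line of arithmetic. Theorem 5.10
is not restated in this section, so the sketch below assumes it is the
standard dimension formula for sl(3;C), namely
(m_1 + 1)(m_2 + 1)(m_1 + m_2 + 2)/2; the function name `dimension` is an
illustrative choice.

```python
def dimension(m1, m2):
    """Dimension of the irreducible representation with highest weight (m1, m2).

    Assumes the standard formula (m1 + 1)(m2 + 1)(m1 + m2 + 2) / 2; the product
    is always even, so integer division is exact.
    """
    return (m1 + 1) * (m2 + 1) * (m1 + m2 + 2) // 2

figure_dimensions = [dimension(1, 2), dimension(2, 2), dimension(4, 0)]
small_cases = {wt: dimension(*wt) for wt in [(0, 0), (1, 0), (0, 1), (1, 1)]}
print(figure_dimensions, small_cases)
```

The small cases agree with what we already know: the trivial representation
has dimension 1, the two fundamental representations have dimension 3, and the
representation with highest weight (1,1) has dimension 8.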


5.8 Exercises 


1. Show that the roots listed in (5.3) are the only roots.

2. Let π be an irreducible finite-dimensional representation of sl(3;C) acting
on a space V and let π* be the dual representation to π, acting on V*, as
defined in Section 4.7. Show that the weights of π* are the negatives of
the weights of π.

Hint: Choose a basis for V in which both π(H_1) and π(H_2) are diagonal.

3. As in Exercise 2, let π be an irreducible representation of sl(3;C) and let
π* be the dual representation to π. Show that if π has highest weight
(m_1, m_2), then π* has highest weight (m_2, m_1).

Hint: Establish this first in the cases (m_1, m_2) = (1,0) and (m_1, m_2) =
(0,1).

4. Consider the adjoint representation of sl(3;C) as a representation of
sl(2;C) by restricting the adjoint representation to the subalgebra spanned
by X_1, Y_1, and H_1. Decompose this representation as a direct sum of
irreducible representations of sl(2;C). Which representations occur and with
what multiplicity?

5. Following the method of Section 5.5, work out the representation of sl(3;C)
with highest weight (2,0), acting on a subspace of C^3 ⊗ C^3. Determine all
the weights of this representation and their multiplicity (i.e., the dimension
of the corresponding weight space). Verify that the dimension formula
(Theorem 5.10) holds in this case.

6. Consider the nine-dimensional representation of sl(3;C) considered in
Section 5.5, namely the tensor product of the representations with highest
weights (1,0) and (0,1). Decompose this representation as a direct sum
of irreducibles. Do the same for the tensor product of two copies of the
irreducible representation with highest weight (1,0). (Compare Exercise
5.)

7. Let V_m denote the space of homogeneous polynomials on C^3 of degree m.
By imitating Section 4.3, construct a representation of SU(3) acting on
V_m. Find the weights for the associated action of sl(3;C) on V_1 and V_2.
Show that V_1 and V_2 are irreducible representations (of SU(3) or sl(3;C)).
What are the highest weights of these representations?

8. Show that Z and N (defined in Definition 5.20) are subgroups of SU(3).
Show that Z is a normal subgroup of N.

9. For each permutation σ of {1,2,3}, let A_σ be the matrix such that
A_σ e_k = sgn(σ) e_{σ(k)}, where sgn(σ) is the sign of the permutation σ, equal
to 1 for even permutations and equal to −1 for odd permutations. Show that the
matrices A_σ form a subgroup of N that is isomorphic to W.

10. Consider the matrix A in SU(3) given by

\[ A = \begin{pmatrix} 0 & 0 & 1 \\ 1 & 0 & 0 \\ 0 & 1 & 0 \end{pmatrix}, \]

which maps e_1 to e_2, e_2 to e_3, and e_3 to e_1. Let H be an arbitrary element
of h and let λ_1, λ_2, and λ_3 be the diagonal entries of H (which must sum
to zero). Compute by hand AHA^{-1} and verify that this is related to H
as described in Section 5.6, namely that λ_1 gets shifted into the second
spot, λ_2 gets shifted into the third spot, and λ_3 gets shifted into the first
spot.

11. Show that under the identification of h* with h described in Section 5.6,
the action of W on h* (described in (5.16)) coincides with the adjoint
action of W on h.

12. Regard the Weyl group as a group of linear transformations of h. Show
that −I is not an element of the Weyl group. Which representations of
sl(3;C) have the property that their weights are invariant under −I?

13. Using the proof of Proposition 5.13, show that every weight μ of an
irreducible representation with highest weight μ_0 must satisfy Condition 2 of
Theorem 5.26.

14. This exercise asks one to “prove” geometrically the following result. Let
μ_0 be a dominant integral element and μ any integral element. If w · μ is
lower than μ_0 for all w ∈ W, then μ is contained in the convex hull of the
W-orbit of μ_0.

To see why this result is true, make a picture of a typical dominant integral
element μ_0 and its W-orbit. Now, take a typical point μ that is not in
the convex hull of the orbit of μ_0 and draw its W-orbit. Show that the
W-orbit of μ contains at least one point that is not lower than μ_0.

This result (along with the invariance of the weights under the action of
the Weyl group) shows that Condition 1 of Theorem 5.26 is a necessary
condition for μ to be a weight of the representation with highest weight
μ_0.


6 


Semisimple Lie Algebras 


In this chapter, we will consider a class of Lie algebras (the complex semisim- 
ple ones) that are sufficiently similar to sl(3;C) that their representations can 
be described, similarly to sl(3;C), by a “theorem of the highest weight.” We 
will not come to the representations themselves until the next chapter; in this 
chapter, we develop the structures needed to state the theorem of the highest 
weight. Although this chapter could be understood simply as a description 
of the structure of semisimple Lie algebras, without any mention of repre- 
sentation theory, I think it is helpful to have the representations in mind. 
The representation theory, especially in light of our experience with sl(3;C), 
motivates the notions of Cartan subalgebras, roots, and the Weyl group. 

We will give three equivalent characterizations of semisimple Lie algebras 
(and there are several other commonly used ones). The first characterization is 
the one that we will take as our definition and which presumably accounts for 
the term “semisimple”: A semisimple Lie algebra is one which is isomorphic 
to a direct sum of simple Lie algebras. The second characterization is that 
a complex Lie algebra is semisimple if and only if it is isomorphic to the 
complexification of the Lie algebra of a compact simply-connected group. This 
characterization shows, for example, that sl(n;C) ≅ su(n)_C is semisimple. The
third characterization is that a Lie algebra g is semisimple if and only if it 
has the complete reducibility property, that is, if and only if every finite- 
dimensional representation of g decomposes as a direct sum of irreducibles. 

Before getting into the details of semisimple Lie algebras, let us briefly 
outline what our strategy will be in classifying their representations and what 
structures we will need to carry out this strategy. We will look for commuting
elements H_1, ..., H_r in our Lie algebra that we will try to simultaneously
diagonalize in each representation. We should find as many such elements as
possible, and if they are going to be simultaneously diagonalizable in every
representation, they must certainly be diagonalizable in the adjoint
representation. This leads (in basis-independent language) to the definition of a
Cartan subalgebra. The nonzero sets of simultaneous eigenvalues for
ad_{H_1}, ..., ad_{H_r} are called roots and the corresponding simultaneous
eigenvectors are called root vectors. The root vectors will serve to raise and
lower the eigenvalues of π(H_1), ..., π(H_r) in each representation π. We will
also have the Weyl group, which is an important symmetry of the roots and also
of the weights in each representation. Finally, we will introduce the notion of
positive roots, in terms of which the notion of “highest weight” will be defined.

One crucial part of the structure of semisimple Lie algebras is the existence 
of certain special subalgebras isomorphic to sl(2;C). Several times over the 
course of this chapter and the next one, we will make use of our knowledge 
of the representations of sl(2;C). In particular, if X, Y, and H are the usual 
basis elements for sl(2;C), then we will use repeatedly that the eigenvalues
of σ(H) in any finite-dimensional representation σ of sl(2;C) must be integers
(Theorem 4.12). In view of the importance of this result, it is worthwhile now
to recall why this is so. From the Lie algebra point of view, we began with
an eigenvector for σ(H) and then used σ(X) and σ(Y) to raise and lower
the eigenvalues for σ(H) in increments of 2. Since the representation is finite
dimensional, this chain of eigenvalues must terminate in both directions. The
calculations of Section 4.4 (especially Lemma 4.11) show that this can happen
only if the highest eigenvalue m of σ(H) is a non-negative integer. In that case,
all of the other eigenvalues of σ(H) are of the form m − 2k and, so, are also
integers.

From the group point of view, we recall that because SU(2) is simply
connected, for each finite-dimensional representation σ of sl(2;C) ≅ su(2)_C
there is a representation Σ of SU(2) such that Σ(exp X) = exp σ(X) for all
X in su(2). We note that 2πiH is in su(2) and that exp(2πiH) = I in SU(2).
Thus,

    exp(2πiσ(H)) = Σ(exp(2πiH)) = Σ(I) = I.


This can happen only if the eigenvalues of σ(H) are integers. After all, if λ is
an eigenvalue for σ(H), then exp(2πiλ) is an eigenvalue for exp(2πiσ(H)) = I,
so exp(2πiλ) = 1 and λ must be an integer.
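
The integrality statement can be checked concretely. The following Python sketch is an illustration only and is not part of the original text. It assumes the usual formulas for the (m+1)-dimensional irreducible representation of sl(2;C) on a basis v_0, ..., v_m, namely H v_k = (m − 2k) v_k, Y v_k = v_{k+1}, and X v_k = k(m − k + 1) v_{k−1} (compare Section 4.4). It then verifies the commutation relations, lists the eigenvalues m − 2k, and confirms that exp(2πiλ) = 1 for each of them. All function names are hypothetical.

```python
import cmath

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def bracket(a, b):
    """Commutator [a, b] = ab - ba of two square matrices."""
    ab, ba = matmul(a, b), matmul(b, a)
    return [[ab[i][j] - ba[i][j] for j in range(len(a))] for i in range(len(a))]

def scale(c, a):
    return [[c * entry for entry in row] for row in a]

def sl2_irrep(m):
    """Matrices of H, X, Y on the basis v_0, ..., v_m (column j is the image of v_j)."""
    n = m + 1
    H = [[m - 2 * j if i == j else 0 for j in range(n)] for i in range(n)]
    X = [[j * (m - j + 1) if i == j - 1 else 0 for j in range(n)] for i in range(n)]
    Y = [[1 if i == j + 1 else 0 for j in range(n)] for i in range(n)]
    return H, X, Y

def check_irrep(m):
    H, X, Y = sl2_irrep(m)
    # The sl(2;C) relations [H,X] = 2X, [H,Y] = -2Y, [X,Y] = H.
    relations_hold = (bracket(H, X) == scale(2, X)
                      and bracket(H, Y) == scale(-2, Y)
                      and bracket(X, Y) == H)
    eigenvalues = [H[k][k] for k in range(m + 1)]
    # exp(2*pi*i*lam) should be 1 (up to rounding) for every eigenvalue lam.
    exp_is_one = all(abs(cmath.exp(2j * cmath.pi * lam) - 1) < 1e-9 for lam in eigenvalues)
    return relations_hold, eigenvalues, exp_is_one

for m in range(4):
    print(m, check_irrep(m))
```

For m = 3, for instance, the eigenvalues come out as 3, 1, −1, −3, in line with the pattern m − 2k described above.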


6.1 Complete Reducibility and Semisimple Lie Algebras 


Recall (Section 4.10) that a group or Lie algebra is said to have the complete 
reducibility property if every finite-dimensional representation of it decom- 
poses as a direct sum of irreducible invariant subspaces. Recall also (Proposi- 
tion 4.36) that a connected compact matrix Lie group always has this property. 
It follows that the Lie algebra of a compact simply-connected matrix Lie group 
also has the complete reducibility property, since, in that case, there is a one- 
to-one correspondence between the representations of the compact group and 
its Lie algebra. Since there is a one-to-one correspondence between the repre- 
sentations of a real Lie algebra and the complex-linear representations of its 
complexification, we see also that if a complex Lie algebra g is isomorphic to 
the complexification of the Lie algebra of a compact simply-connected group,
then g has the complete reducibility property. In Chapter 5, we applied this
reasoning to sl(2; C) (the complexification of the Lie algebra of SU(2)) and to 
sl(3;C) (the complexification of the Lie algebra of SU(3)). 

In this chapter, we will study complex Lie algebras that are isomorphic 
to the complexification of the Lie algebra of a compact simply-connected ma- 
trix Lie group. As it turns out, such Lie algebras are precisely the complex 
semisimple Lie algebras, which we now define. Although we will mostly be 
concerned with complex semisimple Lie algebras, there is a brief discussion of 
real semisimple Lie algebras at the end of this section. 


Definition 6.1. If g is a complex Lie algebra, then an ideal in g is a complex 
subalgebra h of g with the property that for all X in g and H in h, we have 
[X, H] in h.


Note that the definition of an ideal is stronger than that of a subalgebra. 
For a subalgebra, we require only that the bracket of two elements of the 
subalgebra remain in the subalgebra. For an ideal, we require that the bracket 
of an element of the ideal with any element of g be, again, in the ideal. Any 
Lie algebra g has two “trivial” examples of ideals: g itself and the zero ideal
h = {0}.


Definition 6.2. A complex Lie algebra g is called indecomposable if the 
only ideals in g are g and {0}. A complex Lie algebra g is called simple if g 
is indecomposable and dim g ≥ 2.


The term “indecomposable” is not a standard one, but since there does 
not seem to be any standard term for this concept, I have invented one. 
Note that the only indecomposable Lie algebras that are not simple are the 
one-dimensional ones and that any two one-dimensional Lie algebras are iso- 
morphic, since all brackets must be zero. A one-dimensional Lie algebra has 
no nontrivial subalgebras and, hence, certainly no nontrivial ideals. Thus one- 
dimensional Lie algebras are indecomposable but not simple. 

There is an analogy between finite-dimensional Lie algebras and finite 
groups. Subalgebras in the Lie algebra setting are the analogs of subgroups in 
the finite group setting, and ideals in the Lie algebra setting are the analogs 
of normal subgroups in the finite group setting. In this analogy, the one- 
dimensional Lie algebras (which are precisely the Lie algebras having no non- 
trivial subalgebras) are the analogs of the cyclic groups of prime order (which 
are precisely the groups having no nontrivial subgroups). However, there is 
a discrepancy in terminology: cyclic groups of prime order are called simple 
but one-dimensional Lie algebras are not called simple. This terminological 
convention is important to bear in mind in the following definition. 


Definition 6.3. A complex Lie algebra is called reductive if it is isomorphic 
to a direct sum of indecomposable Lie algebras. A complex Lie algebra is called 
semisimple if it is isomorphic to a direct sum of simple Lie algebras.

Note that a reductive Lie algebra is a direct sum of indecomposable algebras,
which are either simple or one-dimensional commutative. Thus, a reductive
Lie algebra is one that decomposes as a direct sum of a semisimple
algebra (coming from the simple terms in the direct sum) and a commutative
algebra (coming from the one-dimensional terms in the direct sum).

We will assume (in the spirit of this book) that the complex semisimple Lie 
algebras we study are given to us as subalgebras of some gl(n; C). There is no 
loss of generality in this since by Ado’s Theorem every finite-dimensional Lie 
algebra has a faithful finite-dimensional representation. In fact, for semisimple 
Lie algebras, the adjoint representation is always faithful, as is easily shown. 
(See Exercise 1.) 


Proposition 6.4. A complex Lie algebra g is reductive precisely if the adjoint 
representation is completely reducible. 


Proof. An ideal is precisely an invariant subspace for the adjoint representation,
as a moment’s thought will confirm. So, if the adjoint representation
decomposes as a direct sum of irreducibles, then g decomposes (as a vector
space) as g_1 ⊕ ··· ⊕ g_m, where each g_k is an ideal and where g_k contains no
ideals of g other than g_k itself and {0}. Now, if X ∈ g_k and Y ∈ g_l (k ≠ l),
then [X,Y] = 0, since [X,Y] must be in both g_k and g_l (because both g_k and
g_l are ideals) and g_k ∩ g_l = {0}. This means that g must be the direct sum
(in the Lie algebra sense) of the g_k’s.

Now, I claim that each g_k must be indecomposable when viewed as a Lie
algebra in its own right. After all, suppose that h is an ideal in g_k. Then, h
is also an ideal in g. (The commutator of an element of g_k with an element of
h will remain in h by assumption. The commutator of an element of h with
an element of g_l, l ≠ k, will be zero.) This means, by our assumptions on the
g_k’s, that h = {0} or h = g_k.

So, if the adjoint representation decomposes as a sum of irreducibles, then
g is reductive. Conversely, if g is reductive, then g is a direct sum of
indecomposable algebras, which are then irreducible invariant subspaces for the
adjoint representation. □


Corollary 6.5. The complexification of the Lie algebra of a connected compact
matrix Lie group is reductive.


This follows from the above proposition and Proposition 4.36 (stating that 
connected compact groups have the complete reducibility property). Note 
that the Lie algebra of a compact Lie group may be only reductive and not 
semisimple. For example, the Lie algebra of S^1 is one dimensional and, thus,
not semisimple. 


Theorem 6.6. A complex Lie algebra is semisimple if and only if it is isomorphic
to the complexification of the Lie algebra of a simply-connected compact
matrix Lie group.


I will not prove this result. Nevertheless, let us discuss the ideas behind it.
One direction is fairly easy, namely proving that if g is the complexification of
the Lie algebra of a compact simply-connected group K, then g is semisimple.
We have already shown that g is reductive, even if K is not simply connected.
Thus g = g_1 ⊕ g_2, with g_1 semisimple and g_2 commutative. It can be shown
that the Lie algebra k of K decomposes as k = k_1 ⊕ k_2, where g_1 = k_1 + ik_1
and g_2 = k_2 + ik_2. Then K decomposes as K_1 × K_2, where K_1 and K_2 are
simply connected and where K_2 is commutative. However, a simply-connected
commutative Lie group is isomorphic to R^n, which is noncompact for n ≥ 1.
Thus, the compactness of K means that k_2 = {0}, in which case g_2 = {0} and
g = g_1 is semisimple.

For the other direction, given a complex semisimple Lie algebra, we must 
find the correct real form whose corresponding simply-connected group is 
compact. For this, see Varadarajan (1974). 


Definition 6.7. If g is a complex semisimple Lie algebra, then a compact
real form of g is a real subalgebra k of g with the property that every X in g
can be written uniquely as X = X_1 + iX_2 with X_1 and X_2 in k and such that
there is a compact simply-connected matrix Lie group K_1 such that the Lie
algebra k_1 of K_1 is isomorphic to k.


Theorem 6.6 tells us that every complex semisimple Lie algebra has a 
compact real form. The compact real form is not unique, but it is “unique up 
to conjugation,” as explained in Section 6.10. 

Note that the connected subgroup K of GL(n;C) whose Lie algebra is k is not
necessarily simply connected itself. Consider, for example, the complex Lie
algebra so(3;C) ⊂ gl(3;C), which is the complexification
of so(3). We note that so(3) is isomorphic to su(2), which is the Lie algebra
of the compact simply-connected group SU(2). This means that so(3;C) is
semisimple and that so(3) is a compact real form of so(3;C). However, the
subgroup of GL(3;C) whose Lie algebra is so(3) is the group SO(3), which is
not simply connected. So, in this case, we have K = SO(3) and K_1 = SU(2).


Proposition 6.8. Let g be a complex semisimple Lie algebra. If g is a subalgebra
of gl(n;C) and k is a compact real form of g, then the connected Lie
subgroup K of GL(n;C) whose Lie algebra is k is compact.


Proof. The definition of a compact real form is that there is a simply-connected
compact matrix Lie group K_1 whose Lie algebra k_1 is isomorphic
to k. Let φ: k_1 → k ⊂ gl(n;C) be a Lie algebra isomorphism. By Theorem 3.7,
there is an associated Lie group homomorphism Φ: K_1 → GL(n;C), and let K
be the image of this homomorphism. Since the image of a compact set under
a continuous map is compact, K is compact (and hence closed). Furthermore,
since the image of k_1 is k, Proposition 3.16 tells us that K is the connected
Lie subgroup of GL(n;C) with Lie algebra k. □


A corollary of Theorem 6.6 is the following result, which we will make use 
of in the next chapter.


Corollary 6.9. Every complex semisimple Lie algebra has the complete
reducibility property.


This holds because the representations of g are in one-to-one correspondence
with the representations of K_1, and compact groups have the complete
reducibility property (Proposition 4.36). Actually, it is not hard to prove
(Exercise 2) that among complex Lie algebras, only the semisimple ones have the
complete reducibility property. Thus, complete reducibility is sometimes taken
as the definition of semisimplicity for Lie algebras. For an algebraic proof of
complete reducibility of semisimple Lie algebras, see Humphreys (1972).

Up to now, we have considered only complex semisimple Lie algebras, since 
these are the ones whose representations we will consider. (Working over C 
instead of R allows us to find nice bases for our Lie algebras.) Nevertheless, we 
can define the terms ideal, indecomposable, simple, reductive, and semisimple 
for real Lie algebras in precisely the same way as for the complex case. 


Proposition 6.10. If g is a real Lie algebra and g_C its complexification, then
g is semisimple if and only if g_C is semisimple.


I will not prove this result; see Varadarajan (1974). Note that the proposition
does not hold if the word “semisimple” is replaced by “simple.” If g_C
is simple, then g must be simple (since if h were a nontrivial ideal in g, then
h_C would be a nontrivial ideal in g_C), but the converse of this statement does
not hold. For example, it can be shown that the six-dimensional real Lie algebra
so(3, 1) is simple. However, its complexification so(3, 1; C) is isomorphic
to so(4;C), which, in turn, is isomorphic to sl(2;C) ⊕ sl(2;C), and so the
complexification is not simple.

As a consequence of the above proposition and Theorem 6.6, we see that 
the real Lie algebra of a compact simply-connected group is semisimple. How- 
ever, not every real semisimple Lie algebra is of this sort. Consider, for exam- 
ple, sl(n;R). Of course, SL(n; R) is noncompact, and there can be no compact 
simply-connected Lie group whose Lie algebra is sl(n;R), since such a group 
would then be the universal cover of SL(n;R), and the universal cover of a 
noncompact group is noncompact. 

Nevertheless, sl(n; R) is semisimple, by Proposition 6.10, because its com- 
plexification is sl(n;C), which is also the complexification of su(n), which is 
the Lie algebra of a compact simply-connected group. So, a group G whose Lie 
algebra g is real semisimple should be thought of as being “almost compact.” 
This is to be understood not in any topological sense but rather in the sense 
that G has a compact simply-connected “cousin” K with the property that
k_C is isomorphic to g_C. For example, SL(n;R) has SU(n) as its cousin and
SO(n, k) has Spin(n + k) (the simply-connected double cover of SO(n + k)) as
its cousin for n + k ≥ 3.

When working with finite-dimensional representations, one can always extend
the representation to the complexification, and so it is easier to work
only with complex semisimple Lie algebras. This will be our approach in the
next chapter.

There are several other equivalent characterizations of semisimple Lie
algebras, for example, that the Lie algebra have no nonzero solvable ideals or
that the Killing form (B(X, Y) = trace(ad_X ad_Y)) be nondegenerate.
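
As an illustration of the Killing form criterion (not part of the original text), the following Python sketch computes B(X, Y) = trace(ad_X ad_Y) on sl(2;C) in the basis (X, Y, H) and checks that the resulting Gram matrix has nonzero determinant. The helper names are hypothetical, and the coordinate function relies on the fact that a traceless 2×2 matrix [[a, b], [c, −a]] equals bX + cY + aH.

```python
def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def bracket(a, b):
    ab, ba = matmul(a, b), matmul(b, a)
    return [[ab[i][j] - ba[i][j] for j in range(len(a))] for i in range(len(a))]

# Standard basis of sl(2;C), in the order (X, Y, H).
X = [[0, 1], [0, 0]]
Y = [[0, 0], [1, 0]]
H = [[1, 0], [0, -1]]
BASIS = [X, Y, H]

def coords(m):
    """Coordinates of the traceless matrix [[a, b], [c, -a]] = b*X + c*Y + a*H."""
    return [m[0][1], m[1][0], m[0][0]]

def ad_matrix(z):
    """Matrix of ad_z in the basis (X, Y, H); column j holds the coordinates of [z, basis_j]."""
    cols = [coords(bracket(z, b)) for b in BASIS]
    return [[cols[j][i] for j in range(3)] for i in range(3)]

def killing(z, w):
    p = matmul(ad_matrix(z), ad_matrix(w))
    return sum(p[i][i] for i in range(3))

def det3(m):
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

gram = [[killing(z, w) for w in BASIS] for z in BASIS]
print(gram)        # [[0, 4, 0], [4, 0, 0], [0, 0, 8]]
print(det3(gram))  # -128, nonzero, so the form is nondegenerate
```

By contrast, on a commutative Lie algebra every ad_X is zero, so the Killing form vanishes identically, consistent with such algebras not being semisimple.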


6.2 Examples of Reductive and Semisimple Lie Algebras 


Let us consider some examples of Lie algebras that are reductive or semisimple, 
starting with the complex case. The following table lists the complex Lie 
algebras that we have encountered in this book that are either reductive or 
semisimple. An entry of “reductive” in the table means actually “reductive 
but not semisimple.” 


    sl(n;C) (n ≥ 2)    semisimple
    so(n;C) (n ≥ 3)    semisimple
    so(2;C)            reductive
    gl(n;C) (n ≥ 1)    reductive
    sp(n;C) (n ≥ 1)    semisimple


To verify the results of this table, we use Theorem 6.6. First, sl(n;C) is the 
complexification of su(n), which is the Lie algebra of the compact simply- 
connected group SU(n). Next, so(n; C) is the complexification of so(n), which 
is the Lie algebra of the compact group SO(n). Unfortunately, SO(n) is not 
simply connected. However, so(n) is also the Lie algebra of Spin(n), which is 
compact and simply connected for all n ≥ 3 (Bröcker and tom Dieck (1985)).
Meanwhile, so(2;C) is one-dimensional commutative and thus reductive but 
not semisimple. 

Next, gl(n;C) is the complexification of u(n), which is the Lie algebra 
of the compact group U(n). This means that gl(n;C) is reductive. However, 
the center of a semisimple Lie algebra must be trivial, and the center of 
gl(n;C) is nontrivial, containing all the multiples of the identity. Note that 
gl(n;C) ≅ sl(n;C) ⊕ C, where sl(n;C) is semisimple and C is one-dimensional
commutative. So, gl(n;C) is reductive but not semisimple. Finally, sp(n; C) is 
the complexification of sp(n), which is the Lie algebra of the compact simply- 
connected group Sp(n). 
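
The decomposition of gl(n;C) into sl(n;C) plus a one-dimensional center can be made explicit. The following Python sketch (an illustration, not from the original text; the function name split_gl is hypothetical) splits a matrix into its traceless part and a multiple of the identity, and checks that the scalar part commutes with another matrix. For simplicity it assumes integer entries, so that exact fractions can be used.

```python
from fractions import Fraction

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def bracket(a, b):
    ab, ba = matmul(a, b), matmul(b, a)
    return [[ab[i][j] - ba[i][j] for j in range(len(a))] for i in range(len(a))]

def split_gl(x):
    """Split x in gl(n) as (traceless part in sl(n), scalar multiple of the identity)."""
    n = len(x)
    c = Fraction(sum(x[i][i] for i in range(n)), n)   # trace(x) / n
    scalar = [[c if i == j else 0 for j in range(n)] for i in range(n)]
    traceless = [[x[i][j] - scalar[i][j] for j in range(n)] for i in range(n)]
    return traceless, scalar

x = [[1, 2, 0], [3, 4, 5], [0, 6, 4]]
y = [[0, 1, 0], [0, 0, 1], [1, 0, 0]]
traceless, scalar = split_gl(x)

print(sum(traceless[i][i] for i in range(3)))  # 0: the first part lies in sl(3;C)
print(bracket(scalar, y))                      # zero matrix: the scalar part is central
```

The scalar part commutes with every matrix, which is the nontrivial center that prevents gl(n;C) from being semisimple.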

All of the above-listed semisimple algebras are actually simple, except 
for so(4;C), which is isomorphic to sl(2;C) ⊕ sl(2;C). In Chapter 8, we will
discuss the classification of complex simple Lie algebras. It turns out that
every complex simple Lie algebra is isomorphic to one of sl(n;C), so(n;C)
(n ≠ 4), sp(n;C), or to one of five “exceptional” Lie algebras conventionally
called G_2, F_4, E_6, E_7, and E_8.

We now make a table of the real Lie algebras we have encountered in 
this book that are either reductive or semisimple. Again, “reductive” means
actually “reductive but not semisimple.” In each case, the complexification of
the listed Lie algebra is isomorphic to one of the complex Lie algebras in the
above table. Note that there can be several nonisomorphic real Lie algebras
whose complexifications are isomorphic to the same complex semisimple Lie
algebra.


    su(n) (n ≥ 2)           semisimple
    so(n) (n ≥ 3)           semisimple
    so(2)                   reductive
    sp(n) (n ≥ 1)           semisimple
    so(n, k) (n + k ≥ 3)    semisimple
    so(1, 1)                reductive
    sp(n;R) (n ≥ 1)         semisimple
    sl(n;R) (n ≥ 2)         semisimple
    gl(n;R) (n ≥ 1)         reductive


The other Lie algebras we have examined in this book, namely the Lie al- 
gebras of the Heisenberg group, the Euclidean group, and the Poincaré group, 
are neither reductive nor semisimple. 


6.3 Cartan Subalgebras 


We now begin to develop the structure that we will use (in the next chapter) 
in describing the representations of complex semisimple Lie algebras. These 
same structures are used to give a classification of semisimple Lie algebras, as 
discussed in Chapter 8. See Section 6.9 for how these structures come out in 
the case g = sl(n;C). 


Definition 6.11. If g is a complex semisimple Lie algebra, then a Cartan
subalgebra of g is a complex subspace h of g with the following properties:


1. For all H_1 and H_2 in h, [H_1, H_2] = 0.
2. For all X in g, if [H, X] = 0 for all H in h, then X is in h.
3. For all H in h, ad_H is diagonalizable.


Condition 1 says that h is a commutative subalgebra of g. Condition 2
says that h is a maximal commutative subalgebra (i.e., not contained in any
larger commutative subalgebra). Condition 3 says that each ad_H (H ∈ h) is
diagonalizable. Since the H’s in h commute, the ad_H’s also commute, and thus
they are simultaneously diagonalizable. (It is a standard result in linear
algebra that any commuting family of diagonalizable matrices is simultaneously
diagonalizable; see Section B.8.)

Of course, the definition of a Cartan subalgebra makes sense in any Lie 
algebra, semisimple or not. However, if g is not semisimple, then g may not 
have any Cartan subalgebras. (See Exercise 3.) Even in the semisimple case 
we must prove that a Cartan subalgebra exists. 
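
To make the three conditions concrete, the following Python sketch checks them for the diagonal subalgebra of sl(3;C). This example is an assumption made here for illustration (the text treats sl(n;C) in Section 6.9), and the helper names are hypothetical. Condition 2 is only spot-checked on one non-diagonal matrix rather than proved.

```python
N = 3

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(N)) for j in range(N)] for i in range(N)]

def bracket(a, b):
    ab, ba = matmul(a, b), matmul(b, a)
    return [[ab[i][j] - ba[i][j] for j in range(N)] for i in range(N)]

def scale(c, a):
    return [[c * entry for entry in row] for row in a]

def unit(i, j):
    """Elementary matrix E_ij (0-based indices)."""
    return [[1 if (r, c) == (i, j) else 0 for c in range(N)] for r in range(N)]

def diag(h):
    return [[h[r] if r == c else 0 for c in range(N)] for r in range(N)]

ZERO = diag([0, 0, 0])
h = [2, 1, -3]           # distinct diagonal entries summing to zero
H = diag(h)

# Condition 1: diagonal matrices commute with each other.
cond1 = bracket(diag([1, -1, 0]), diag([0, 1, -1])) == ZERO

# Condition 3: ad_H is diagonal in the basis of elementary matrices,
# since [H, E_ij] = (h_i - h_j) E_ij for i != j (and [H, E_ii] = 0).
cond3 = all(bracket(H, unit(i, j)) == scale(h[i] - h[j], unit(i, j))
            for i in range(N) for j in range(N) if i != j)

# Condition 2 (one instance only): a matrix with an off-diagonal entry
# does not commute with H.
x = [[1, 5, 0], [0, 1, 0], [0, 0, -2]]
cond2_instance = bracket(H, x) != ZERO

print(cond1, cond3, cond2_instance)
```

The eigenvalues h_i − h_j that appear in the check of Condition 3 are the simultaneous eigenvalues of the ad_H’s on the elementary matrices.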


Plate 1: The A3 root system


Plate 2: The B3 root system 


Plate 3: The C3 root system 


Plate 4: Dominant integral elements for A3


Proposition 6.12. Let g be a complex semisimple Lie algebra, let k be a compact
real form of g, and let t be any maximal commutative subalgebra of k.
Define h ⊂ g to be h = t + it. Then, h is a Cartan subalgebra of g.


Note that k (or any other Lie algebra) contains a maximal commutative
subalgebra. After all, let t_1 be any one-dimensional subspace of k. Then, t_1
is a commutative subalgebra of k. If t_1 is maximal, then we are done; if not,
then we choose some commutative subalgebra t_2 properly containing t_1. Then,
if t_2 is maximal, we are done, and if not, we choose a commutative subalgebra
t_3 properly containing t_2. Since k is finite dimensional, this process cannot go
on forever and we will eventually get a maximal commutative subalgebra.


Proof. It is clear that h is a commutative subalgebra of g. We must first show
that h is maximal commutative. So, suppose that X ∈ g commutes with every
element of h, which certainly means that it commutes with every element of
t. Then, write X = X_1 + iX_2 with X_1 and X_2 in k. Then, for H in t, we have


    [H, X_1 + iX_2] = [H, X_1] + i[H, X_2] = 0,


where [H, X_1] and [H, X_2] are in k (since k is a real subalgebra). However,
since every element of g has a unique decomposition as an element of k plus
an element of ik, we see that [H, X_1] and [H, X_2] must separately be zero.
Since this holds for all H in t and since t is maximal commutative, we must
have X_1 and X_2 in t, which means that X = X_1 + iX_2 is in h. This shows
that h is maximal commutative.

We assume, as usual, that g is given as a subalgebra of gl(n;C). Let K
be the subgroup of GL(n;C) whose Lie algebra is k. According to Proposition
6.8, K is compact. So, by the averaging method of Section 4.10, there exists a
real-valued inner product ⟨·,·⟩ on k that is invariant under the adjoint action
of K. This inner product can then be extended to a complex-valued inner
product on g, also denoted ⟨·,·⟩, that is invariant under the adjoint action
of K and that takes real values on k. This means that for each A in K, Ad_A
is a unitary operator on g (with respect to the inner product ⟨·,·⟩). It then
follows that for each X in k, ad_X : g → g is skew self-adjoint. (This is by the
same argument that shows that the Lie algebra of U(n) consists of
skew-self-adjoint matrices.) Thus, in particular, ad_H is skew for each H in t,
and a skew operator on a finite-dimensional complex inner product space is
automatically diagonalizable. (See Appendix B.)

Finally, if H is any element of h = t + it, then H = H_1 + iH_2, with H_1
and H_2 in t. Since H_1 and H_2 commute, ad_{H_1} and ad_{H_2} also commute, and,
therefore, ad_H will be diagonalizable (since a linear combination of commuting
diagonalizable operators is diagonalizable). This shows that h is a Cartan
subalgebra. □


It is possible to prove that every Cartan subalgebra of g arises as in
Proposition 6.12 (for some compact real form k and some maximal commutative
subalgebra t of k) and also that Cartan subalgebras are “unique up to
conjugation.” (See Section 6.10 for more precise statements.) In particular, all
Cartan subalgebras of a given complex semisimple Lie algebra have the same
dimension. In light of this result, the following definition makes sense.


Definition 6.13. If g is a complex semisimple Lie algebra, then the rank of 
g is the dimension of any Cartan subalgebra. 


6.4 Roots and Root Spaces 


From now on we assume that we have chosen a compact real form k of g
and a maximal commutative subalgebra t of k, and we consider the Cartan
subalgebra h = t + it. We assume also that we have chosen (as in the proof
of Proposition 6.12) an inner product on g that is invariant under the adjoint
action of K and that takes real values on k.


Definition 6.14. A root of g (relative to the Cartan subalgebra h) is a
nonzero linear functional α on h such that there exists a nonzero element
X of g with

    [H, X] = α(H)X


for all H in h.
The set of all roots is denoted R.


The condition on X says that X is an eigenvector for each ad_H, with
eigenvalue α(H). Note that if X is actually an eigenvector for each ad_H with
H in h, then the eigenvalues must depend linearly on H. That is why we
insist that α be a linear functional on h. (See Section B.8.) So, a root is
just a (nonzero) collection of simultaneous eigenvalues for the ad_H’s. Note
that any element of h is a simultaneous eigenvector for all the ad_H’s, with all
eigenvalues equal to zero, but we only call α a root if α is nonzero. Of course,
for any root α, some of the α(H)’s may be equal to zero; we just require that
not all of them be zero.


Proposition 6.15. If α is a root, then α(H) is imaginary for all H in t.


Proof. By the proof of Proposition 6.12, there exists an inner product on g
such that ad_H is skew self-adjoint for all H in t. The eigenvalues of a skew
operator are necessarily imaginary and each α(H) is an eigenvalue for ad_H.
□


Note that the set of linear functionals on h that are imaginary on t forms
a real vector space whose real dimension equals the complex dimension of h.
If t* denotes the space of real-valued linear functionals on t, then the roots
are contained in it* ⊂ h*.


Definition 6.16. If α is a root, then the root space g_α is the space of all X
in g for which [H, X] = α(H)X for all H in h. An element of g_α is called a
root vector (for the root α).

More generally, if α is any element of h*, we define g_α to be the space of
all X in g for which [H, X] = α(H)X for all H in h (but we do not call g_α a
root space unless α is actually a root).


Taking α = 0, we see that g_0 is the set of all elements of g that commute
with every element of h. Since h is a maximal commutative subalgebra, we
conclude that g_0 = h. If α is not zero and not a root, then g_α = {0}.

Now, since h is commutative, the operators ad_H, H ∈ h, all commute.
Furthermore, by the definition of a Cartan subalgebra, each ad_H, H ∈ h,
is diagonalizable. It follows (Proposition B.13) that the ad_H’s, H ∈ h, are
simultaneously diagonalizable. As a result, g can be decomposed as the direct
sum of h and the root spaces g_α. (Here, we make use of Proposition B.14.)
Thus, we have established the following.


Proposition 6.17. The Lie algebra g can be decomposed as a direct sum as
follows:

    g = h ⊕ ⨁_{α ∈ R} g_α.


This means that every element of g can be written uniquely as a sum of
an element of h and one element from each root space g_α.


Proposition 6.18. For any α and β in h*, [g_α, g_β] ⊂ g_{α+β}.


More explicitly, this means that if X is in g_α and Y is in g_β, then [X,Y]
is in g_{α+β}. In particular, if X is in g_α and Y is in g_{−α}, then [X,Y] is in h.
Furthermore, if X is in g_α, Y is in g_β, and α + β is neither zero nor a root,
then [X,Y] = 0.


Proof. We use that ad_H is a derivation, which means that [H,[X,Y]] =
[[H, X], Y] + [X,[H,Y]]. This identity is equivalent to the Jacobi identity
(Exercise 22 from Chapter 2). So, if X is in g_α and Y is in g_β, then we have
for all H in h,


    [H, [X,Y]] = [α(H)X, Y] + [X, β(H)Y]
               = (α(H) + β(H))[X,Y].

This shows that [X,Y] is in g_{α+β}. □
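
The statement can be tested exhaustively in a small case. The following Python sketch is an illustration rather than part of the original text. It assumes g = sl(3;C) with h the diagonal subalgebra, so that the root vectors are the elementary matrices E_ij (i ≠ j) with roots α_ij(H) = h_i − h_j. Each root is encoded as a tuple of coefficients of (h_1, h_2, h_3), and the helper names are hypothetical.

```python
N = 3

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(N)) for j in range(N)] for i in range(N)]

def bracket(a, b):
    ab, ba = matmul(a, b), matmul(b, a)
    return [[ab[i][j] - ba[i][j] for j in range(N)] for i in range(N)]

def unit(i, j):
    """Elementary matrix E_ij (0-based indices)."""
    return [[1 if (r, c) == (i, j) else 0 for c in range(N)] for r in range(N)]

def root(i, j):
    """Coefficient vector of alpha_ij, where alpha_ij(diag(h)) = h_i - h_j."""
    return tuple((1 if k == i else 0) - (1 if k == j else 0) for k in range(N))

# Map each root to the index pair of its root vector E_ij.
ROOTS = {root(i, j): (i, j) for i in range(N) for j in range(N) if i != j}

def bracket_lands_in_sum(a, b):
    """Check that [E_a, E_b] lies in the space attached to a + b."""
    (i, j), (k, l) = ROOTS[a], ROOTS[b]
    c = bracket(unit(i, j), unit(k, l))
    total = tuple(s + t for s, t in zip(a, b))
    cells = [(r, s) for r in range(N) for s in range(N)]
    if total in ROOTS:
        # a + b is a root: only the entry of its root vector may be nonzero
        return all(c[r][s] == 0 for r, s in cells if (r, s) != ROOTS[total])
    if all(t == 0 for t in total):
        # a + b = 0: the bracket must be diagonal, that is, lie in h
        return all(c[r][s] == 0 for r, s in cells if r != s)
    # a + b is neither zero nor a root: the bracket must vanish
    return all(c[r][s] == 0 for r, s in cells)

ok = all(bracket_lands_in_sum(a, b) for a in ROOTS for b in ROOTS)
print(ok)
```

For example, [E_12, E_23] = E_13, matching α_12 + α_23 = α_13, while [E_12, E_13] = 0 because α_12 + α_13 is neither zero nor a root.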


Proposition 6.19.


1. If α ∈ h* is a root, then so is −α.
2. The roots span h*.


Proof. Since α is a root, there exists a nonzero element X of g such that
[H, X] = α(H)X for all H in h and thus, in particular, for all H in t. Then,
X can be written uniquely as X = X_1 + iX_2 with X_1 and X_2 in k. So, for H
in t, we have

    [H, X] = [H, X_1] + i[H, X_2],


where [H, X_1] and [H, X_2] are in k, since k is a real subalgebra. Recall that
α(H) is imaginary for H in t. So, write α(H) = ia, with a real. Then,


    [H, X] = iaX = −aX_2 + iaX_1.


Since each element of g has a unique decomposition as a sum of an element
of k and an element of ik, we must have [H, X_1] = −aX_2 and [H, X_2] = aX_1.
Now, put Y = X_1 − iX_2. Then,


    [H, Y] = [H, X_1] − i[H, X_2]
           = −aX_2 − iaX_1
           = −ia(X_1 − iX_2)
           = −iaY.


Thus, [H,Y] = −α(H)Y for all H in t, and thus also for all H in h. This
shows that −α is another root.

For Point 2, suppose that the roots did not span h*. Then, there would
exist a nonzero H ∈ h such that α(H) = 0 for all α ∈ R. Then, [H, H_1] = 0 for
all H_1 in h, and also [H, X] = α(H)X = 0 for X in g_α. Thus, by Proposition
6.17, H would commute with all elements of g; that is, H would be in the
center of g. However, the center of a semisimple Lie algebra must be zero
(Exercise 1), and we have a contradiction. □


We now come to the first substantial result about roots and root spaces, 
the proof of which will occupy the remainder of this section. 


Theorem 6.20. 


1. If $\alpha$ is a root, then the only multiples of $\alpha$ that are roots are $\alpha$ and $-\alpha$.

2. If $\alpha$ is a root, then the root space $\mathfrak{g}_\alpha$ is one dimensional.

3. For each root $\alpha$, we can find nonzero elements $X_\alpha$ in $\mathfrak{g}_\alpha$, $Y_\alpha$ in $\mathfrak{g}_{-\alpha}$, and $H_\alpha$ in $\mathfrak{h}$ such that

$$\begin{aligned}
[H_\alpha, X_\alpha] &= 2X_\alpha, \\
[H_\alpha, Y_\alpha] &= -2Y_\alpha, \\
[X_\alpha, Y_\alpha] &= H_\alpha.
\end{aligned}$$

The element $H_\alpha$ is unique (i.e., independent of the choice of $X_\alpha$ and $Y_\alpha$).

Point 3 of the theorem tells us that $X_\alpha$, $Y_\alpha$, and $H_\alpha$ span a subalgebra of $\mathfrak{g}$ isomorphic to $\mathrm{sl}(2;\mathbb{C})$. The elements $H_\alpha$ of $\mathfrak{h}$ given in Point 3 of the theorem are called the co-roots. Their properties are closely related to the properties of the roots themselves.
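To make Point 3 concrete, here is a minimal sketch in plain Python. It assumes the standard realization of $\mathrm{sl}(3;\mathbb{C})$ by $3 \times 3$ matrices with the diagonal Cartan subalgebra, and takes the root $\alpha(\mathrm{diag}(d_1, d_2, d_3)) = d_1 - d_2$. The helper names (`E`, `bracket`, and so on) are illustrative and not part of the text.

```python
# Sketch: the sl(2;C) triple attached to the root alpha = d1 - d2 of sl(3;C).
# E(i, j) is a 3x3 matrix unit; all helper names are hypothetical.

def E(i, j, n=3):
    """Matrix unit with a 1 in row i, column j (1-based) and 0 elsewhere."""
    return [[1 if (r, c) == (i - 1, j - 1) else 0 for c in range(n)]
            for r in range(n)]

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def sub(a, b):
    return [[x - y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def scale(c, a):
    return [[c * x for x in row] for row in a]

def bracket(a, b):
    """Commutator [a, b] = ab - ba."""
    return sub(matmul(a, b), matmul(b, a))

X_alpha = E(1, 2)
Y_alpha = E(2, 1)
H_alpha = sub(E(1, 1), E(2, 2))      # diag(1, -1, 0)

# A sample trace-zero diagonal matrix D and the value alpha(D) = d1 - d2.
D = [[5, 0, 0], [0, -2, 0], [0, 0, -3]]
alpha_of_D = D[0][0] - D[1][1]

print(bracket(D, X_alpha) == scale(alpha_of_D, X_alpha))   # X_alpha is a root vector
print(bracket(H_alpha, X_alpha) == scale(2, X_alpha))
print(bracket(H_alpha, Y_alpha) == scale(-2, Y_alpha))
print(bracket(X_alpha, Y_alpha) == H_alpha)
```

The same three relations are the defining relations of the usual basis of $\mathrm{sl}(2;\mathbb{C})$, which is why the span of the three elements is a copy of that algebra.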

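In preparation for the proof of Theorem 6.20, we choose (as in the proof of Proposition 6.12) an inner product $\langle\cdot,\cdot\rangle$ on $\mathfrak{g}$ that is invariant under the adjoint action of $K$ and that takes real values on $\mathfrak{k}$. This means that for each $X$ in $\mathfrak{k}$, $\mathrm{ad}_X$ is skew; that is,

$$\langle \mathrm{ad}_X Y, Z\rangle = -\langle Y, \mathrm{ad}_X Z\rangle \tag{6.1}$$

for all $X$ in $\mathfrak{k}$ and all $Y$ and $Z$ in $\mathfrak{g}$. Now, suppose $X$ is any element of $\mathfrak{g}$ with $X = X_1 + iX_2$ ($X_1, X_2 \in \mathfrak{k}$) and define

$$X^* = -X_1 + iX_2. \tag{6.2}$$

The motivation for this definition is that if $\mathfrak{g} = \mathrm{sl}(n;\mathbb{C})$ and $\mathfrak{k} = \mathrm{su}(n)$, then $X^*$ is the usual adjoint of $X$.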
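The motivating remark can be checked directly in a small case. The sketch below (plain Python, illustrative helper names) takes $\mathfrak{g} = \mathrm{sl}(2;\mathbb{C})$ and $\mathfrak{k} = \mathrm{su}(2)$, splits a sample matrix as $X = X_1 + iX_2$ with $X_1 = (X - X^\dagger)/2$ and $X_2 = (X + X^\dagger)/(2i)$, and compares $-X_1 + iX_2$ with the conjugate transpose. The sample matrix is an arbitrary choice whose entries keep the floating-point arithmetic exact.

```python
# Sketch: for g = sl(2;C) and k = su(2), check that X* := -X1 + i*X2 is the
# conjugate transpose of X. Helper names are illustrative, not from the text.

def dagger(m):
    """Conjugate transpose of a square matrix stored as nested lists."""
    n = len(m)
    return [[m[j][i].conjugate() for j in range(n)] for i in range(n)]

def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def scale(c, a):
    return [[c * x for x in row] for row in a]

def is_skew_hermitian(m):
    """True if m + m^dagger = 0."""
    return all(x == 0 for row in add(m, dagger(m)) for x in row)

# A trace-zero sample matrix.
X = [[1 + 2j, 3 - 1j], [4 + 5j, -1 - 2j]]

# X1 = (X - X^dagger)/2 and X2 = (X + X^dagger)/(2i) are skew-Hermitian
# with trace zero, so both lie in su(2).
X1 = scale(0.5, add(X, scale(-1, dagger(X))))
X2 = scale(-0.5j, add(X, dagger(X)))

X_star = add(scale(-1, X1), scale(1j, X2))   # definition (6.2)

print(is_skew_hermitian(X1), is_skew_hermitian(X2))
print(add(X1, scale(1j, X2)) == X)           # X = X1 + i X2
print(X_star == dagger(X))                   # X* is the usual adjoint
```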
It follows from (6.1) that for any $X$ in $\mathfrak{g}$ (not necessarily in $\mathfrak{k}$), we have

$$\langle \mathrm{ad}_X Y, Z\rangle = \langle Y, \mathrm{ad}_{X^*} Z\rangle. \tag{6.3}$$

Furthermore, looking at the proof of Proposition 6.19, we see that if $X$ is in the root space $\mathfrak{g}_\alpha$, then $X^*$ is in the root space $\mathfrak{g}_{-\alpha}$. (The element $Y$ in the proof of that proposition is $-X^*$.)

We now come to a simple but crucial calculation. 


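Lemma 6.21. Suppose that $X$ is in $\mathfrak{g}_\alpha$, $Y$ is in $\mathfrak{g}_{-\alpha}$, and $H$ is in $\mathfrak{h}$. Then, $[X, Y]$ is in $\mathfrak{h}$ and

$$\langle [X, Y], H\rangle = \alpha(H)\,\langle Y, X^*\rangle,$$

where $X^*$ is defined by (6.2).


Proof. That $[X, Y]$ is in $\mathfrak{h}$ follows from Proposition 6.18. Then, using (6.3), we compute that

$$\begin{aligned}
\langle [X, Y], H\rangle &= \langle \mathrm{ad}_X Y, H\rangle = \langle Y, \mathrm{ad}_{X^*} H\rangle \\
&= \langle Y, [X^*, H]\rangle = -\langle Y, [H, X^*]\rangle.
\end{aligned} \tag{6.4}$$

However, since $X$ is in $\mathfrak{g}_\alpha$, $X^*$ is in $\mathfrak{g}_{-\alpha}$, and, so, (6.4) becomes

$$\langle [X, Y], H\rangle = -\langle Y, -\alpha(H)X^*\rangle = \alpha(H)\,\langle Y, X^*\rangle.$$

Recall that we take the inner product to have the complex conjugate in the first factor, so the scalar $\alpha(H)$ comes out of the second factor without conjugation. □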
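The identity in Lemma 6.21 is easy to test numerically. The sketch below assumes $\mathfrak{g} = \mathrm{sl}(2;\mathbb{C})$ with the inner product $\langle A, B\rangle = \mathrm{trace}(A^\dagger B)$, which is conjugate-linear in the first factor, invariant under conjugation by unitary matrices, and real on $\mathrm{su}(2)$. That particular inner product, the sample scalars, and the helper names are assumptions made for illustration.

```python
# Sketch: check <[X,Y],H> = alpha(H) <Y, X*> in sl(2;C), using the
# (assumed) inner product <A,B> = trace(A^dagger B).

def dagger(m):
    n = len(m)
    return [[m[j][i].conjugate() for j in range(n)] for i in range(n)]

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def bracket(a, b):
    ab, ba = matmul(a, b), matmul(b, a)
    return [[x - y for x, y in zip(r, s)] for r, s in zip(ab, ba)]

def inner(a, b):
    """trace(a^dagger b): conjugate-linear in a, linear in b."""
    p = matmul(dagger(a), b)
    return sum(p[i][i] for i in range(len(p)))

# Sample scalars with small integer parts, so the arithmetic is exact.
c, d, h = 2 + 1j, 1 - 3j, 3 + 2j
X = [[0, c], [0, 0]]        # element of the root space g_alpha
Y = [[0, 0], [d, 0]]        # element of g_{-alpha}
H = [[h, 0], [0, -h]]       # element of the Cartan subalgebra
alpha_H = 2 * h             # [H, X] = 2h X, so alpha(H) = 2h
X_star = dagger(X)          # for sl(n;C), X* is the usual adjoint

lhs = inner(bracket(X, Y), H)
rhs = alpha_H * inner(Y, X_star)
print(lhs, rhs)
```

Changing `c`, `d`, or `h` to other values should leave the two printed numbers equal, up to rounding when the values are no longer small integers.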


With this lemma established, we turn to the proof of Theorem 6.20. 
Proof. The proof will be in several steps. Throughout the proof, $\alpha$ will be a fixed root.


Step 1. If $X$ is in $\mathfrak{g}_\alpha$ and $Y$ is in $\mathfrak{g}_{-\alpha}$, then $[X, Y]$ is in $\mathfrak{h}$ and $[X, Y]$ is orthogonal to all elements $H$ of $\mathfrak{h}$ for which $\alpha(H)$ is zero.


This is an immediate consequence of Lemma 6.21.

Let $\ker\alpha$ denote the space of all $H$ in $\mathfrak{h}$ for which $\alpha(H) = 0$ and let $(\ker\alpha)^\perp$ denote the orthogonal complement of $\ker\alpha$ in $\mathfrak{h}$. Then, if $\mathfrak{h}$ has dimension $r$, $\ker\alpha$ will have dimension $r - 1$ and $(\ker\alpha)^\perp$ will have dimension 1. Thus, Step 1 is telling us that $[\mathfrak{g}_\alpha, \mathfrak{g}_{-\alpha}]$ is contained in the one-dimensional subspace $(\ker\alpha)^\perp$ of $\mathfrak{h}$. This result will be important for us as we continue with the proof of Theorem 6.20.


Step 2. Let $X$ be a nonzero element of $\mathfrak{g}_\alpha$, so that $X^*$ is a nonzero element of $\mathfrak{g}_{-\alpha}$. Then, $[X, X^*] \neq 0$ and $\alpha([X, X^*])$ is real and strictly positive.


To see that $[X, X^*]$ is not zero, we apply Lemma 6.21 with $Y = X^*$ and with $H$ any element of $\mathfrak{h}$ for which $\alpha(H) \neq 0$. The lemma then shows that $\langle [X, Y], H\rangle = \alpha(H)\langle X^*, X^*\rangle \neq 0$, which means that $[X, Y] \neq 0$.

Now apply Lemma 6.21 with $Y = X^*$ and with $H = [X, X^*]$. This gives

$$\langle [X, X^*], [X, X^*]\rangle = \alpha([X, X^*])\,\langle X^*, X^*\rangle.$$

From the positivity of the inner product and the fact that $[X, X^*]$ is nonzero, we conclude that $\alpha([X, X^*])$ is real and strictly positive.


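Step 3. We can choose nonzero elements $X_\alpha \in \mathfrak{g}_\alpha$, $Y_\alpha \in \mathfrak{g}_{-\alpha}$, and $H_\alpha \in \mathfrak{h}$ such that $[H_\alpha, X_\alpha] = 2X_\alpha$, $[H_\alpha, Y_\alpha] = -2Y_\alpha$, and $[X_\alpha, Y_\alpha] = H_\alpha$.

We initially take $X$ to be any nonzero element of $\mathfrak{g}_\alpha$, $Y$ to be $X^*$, and $H$ to be $[X, Y] = [X, X^*]$. Then, $[H, X] = \alpha(H)X$, where $\alpha(H) = \alpha([X, X^*]) > 0$, and $[H, Y] = -\alpha(H)Y$. We now set

$$\begin{aligned}
H_\alpha &= \frac{2}{\alpha(H)}\,H, \\
X_\alpha &= \sqrt{\frac{2}{\alpha(H)}}\,X, \\
Y_\alpha &= \sqrt{\frac{2}{\alpha(H)}}\,Y.
\end{aligned}$$

Direct calculation (check!) then shows that these elements have the required commutation relations.

Note that, on the one hand, $[H_\alpha, X_\alpha] = \alpha(H_\alpha)X_\alpha$ and, on the other hand, $[H_\alpha, X_\alpha] = 2X_\alpha$. So, evidently, $\alpha(H_\alpha) = 2$. Note also that we have chosen $X_\alpha$ and $Y_\alpha$ in such a way that $Y_\alpha = X_\alpha^*$.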
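The "(check!)" in Step 3 can be carried out numerically in the simplest case. The sketch below assumes $\mathfrak{g} = \mathrm{sl}(2;\mathbb{C})$, where $X^*$ is the conjugate transpose and $\alpha(\mathrm{diag}(h_1, h_2)) = h_1 - h_2$; the starting vector $X$ and the helper names are arbitrary illustrative choices.

```python
# Sketch: carry out the normalization of Step 3 in sl(2;C), starting from
# an arbitrary nonzero X in the root space spanned by E_12.
import math

def dagger(m):
    n = len(m)
    return [[m[j][i].conjugate() for j in range(n)] for i in range(n)]

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def scale(c, a):
    return [[c * x for x in row] for row in a]

def bracket(a, b):
    ab, ba = matmul(a, b), matmul(b, a)
    return [[x - y for x, y in zip(r, s)] for r, s in zip(ab, ba)]

def close(a, b, tol=1e-12):
    return all(abs(x - y) < tol for r, s in zip(a, b) for x, y in zip(r, s))

X = [[0, 3 + 4j], [0, 0]]            # arbitrary nonzero element of g_alpha
Y = dagger(X)                        # Y = X*
H = bracket(X, Y)                    # H = [X, X*], a diagonal matrix
alpha_H = (H[0][0] - H[1][1]).real   # alpha(H); real and positive by Step 2

H_a = scale(2 / alpha_H, H)
X_a = scale(math.sqrt(2 / alpha_H), X)
Y_a = scale(math.sqrt(2 / alpha_H), Y)

print(alpha_H > 0)
print(close(bracket(H_a, X_a), scale(2, X_a)))
print(close(bracket(H_a, Y_a), scale(-2, Y_a)))
print(close(bracket(X_a, Y_a), H_a))
print(close(Y_a, dagger(X_a)))       # Y_alpha = X_alpha^*
```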


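Step 4. If $\beta$ is a root of the form $\beta = k\alpha$ for some constant $k$, then $k$ is an integer multiple of $\frac{1}{2}$.

We let $\mathfrak{s}^\alpha$ be the subalgebra of $\mathfrak{g}$ given by

$$\mathfrak{s}^\alpha = \mathrm{span}\{X_\alpha, Y_\alpha, H_\alpha\},$$

which is isomorphic to $\mathrm{sl}(2;\mathbb{C})$. We then let $V^\alpha$ be the subspace of $\mathfrak{g}$ spanned by (1) the subspace $(\ker\alpha)^\perp \subset \mathfrak{h}$ and (2) the root spaces $\mathfrak{g}_\beta$ for which $\beta$ is a multiple of $\alpha$. Recall that $(\ker\alpha)^\perp$ is one dimensional and that $H_\alpha$ is a nonzero element of this space.

I claim that $V^\alpha$ is invariant under the adjoint action of $\mathfrak{s}^\alpha$. To see this, we first show that $V^\alpha$ is invariant under $\mathrm{ad}_{H_\alpha}$, which is clear since both $(\ker\alpha)^\perp$ and the $\mathfrak{g}_\beta$'s are eigenspaces for $\mathrm{ad}_{H_\alpha}$. Then, we show that $V^\alpha$ is invariant under $\mathrm{ad}_{X_\alpha}$. On $(\ker\alpha)^\perp$, which is spanned by $H_\alpha$, we have $\mathrm{ad}_{X_\alpha} H_\alpha = -2X_\alpha \in \mathfrak{g}_\alpha$. By Proposition 6.18, for $Y$ in $\mathfrak{g}_\beta$, $\mathrm{ad}_{X_\alpha} Y$ will be in $\mathfrak{g}_{\alpha+\beta}$. If $\alpha + \beta$ is not zero, then $\mathfrak{g}_{\alpha+\beta}$ is either zero or another root space with root a multiple of $\alpha$. If $\alpha + \beta$ is zero, then $\mathrm{ad}_{X_\alpha} Y$ will be in $(\ker\alpha)^\perp$, by Step 1, and, thus, $\mathrm{ad}_{X_\alpha} Y$ is in $V^\alpha$. A similar argument shows that $V^\alpha$ is invariant under $\mathrm{ad}_{Y_\alpha}$.

Thus, $V^\alpha$ is invariant under the adjoint action of $\mathfrak{s}^\alpha$, which means that $V^\alpha$ is a representation of $\mathfrak{s}^\alpha$. Now, we know that in any finite-dimensional representation of $\mathfrak{s}^\alpha \cong \mathrm{sl}(2;\mathbb{C})$, the eigenvalues for $H_\alpha$ must be integers. What eigenvalues of $\mathrm{ad}_{H_\alpha}$ arise in $V^\alpha$? Recall that $\alpha(H_\alpha) = 2$. Then, for $Y \in \mathfrak{g}_\beta$ with $\beta = k\alpha$, we will have $\mathrm{ad}_{H_\alpha} Y = \beta(H_\alpha)Y = k\alpha(H_\alpha)Y = 2kY$. Thus, $2k$ must be an integer, which is what we are trying to show.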


Step 5. If $\alpha$ is a root, then $2\alpha$ is not a root.


We use the complete reducibility of representations of $\mathfrak{s}^\alpha \cong \mathrm{sl}(2;\mathbb{C})$ (which comes from the compactness of SU(2)). Note that $\mathfrak{s}^\alpha$ itself is an (irreducible) invariant subspace for the adjoint action of $\mathfrak{s}^\alpha$. By complete reducibility and Proposition 4.33, $V^\alpha$ decomposes as a direct sum of $\mathfrak{s}^\alpha$ and several other irreducible invariant subspaces $U_1, \ldots, U_m$.

Recall that $\alpha(H_\alpha) = 2$. If, then, there were a nonzero element of $\mathfrak{g}_{2\alpha}$, it would be an eigenvector for $\mathrm{ad}_{H_\alpha}$ with eigenvalue $2\alpha(H_\alpha) = 4$. This means that the eigenvalue 4 for $\mathrm{ad}_{H_\alpha}$ would have to arise in one of the $U_j$'s. (Since $\mathfrak{s}^\alpha$ and each of the $U_j$'s is invariant, if 4 is to be an eigenvalue of $\mathrm{ad}_{H_\alpha}$, then it must be an eigenvalue in one of these spaces, and it is not an eigenvalue in $\mathfrak{s}^\alpha$.) However, by what we know of the representation theory of $\mathrm{sl}(2;\mathbb{C})$, if 4 is an eigenvalue of $\mathrm{ad}_{H_\alpha}$ in $U_k$, then 0 is also an eigenvalue of $\mathrm{ad}_{H_\alpha}$ in $U_k$. (After all, the eigenvalues in any irreducible representation go from some maximum value of $m$ to $-m$ in increments of 2, and $m$ must be even in order for 4 to occur, in which case 0 must also occur.)

This means that we must have a nonzero vector $Z$ in some $U_k \subset V^\alpha$ with $\mathrm{ad}_{H_\alpha} Z = 0$. The vector $Z$ must be in $(\ker\alpha)^\perp \subset \mathfrak{h}$, because $V^\alpha$ is the direct sum of $(\ker\alpha)^\perp$ and eigenspaces for $\mathrm{ad}_{H_\alpha}$ with nonzero eigenvalues (namely the $\mathfrak{g}_\beta$'s with $\beta$ a multiple of $\alpha$). However, we have already established that $(\ker\alpha)^\perp$ is one dimensional and that $H_\alpha$ is a nonzero element of this space. Thus, $Z$ is a nonzero multiple of $H_\alpha$, which means that $\mathfrak{s}^\alpha$ and $U_k$ have a nonzero intersection. This is a contradiction, since $V^\alpha$ is the direct sum of $\mathfrak{s}^\alpha$ and the various $U_j$'s.




Step 6. The only multiples of $\alpha$ that are roots are $\alpha$ and $-\alpha$.


Suppose $\beta = k\alpha$ is a root (with $k \neq 0$, since roots by definition are nonzero). By Step 4, $k$ must be an integer multiple of $\frac{1}{2}$, and by Step 5, $k \neq 2$. We may assume $k > 0$ (if not, replace $\beta$ by $-\beta$), in which case, $k = 1/2, 1, 3/2, 5/2, 3, \ldots$. However, now $\alpha = \frac{1}{k}\beta$, and precisely the same arguments apply with the roles of $\alpha$ and $\beta$ reversed, and, so, $1/k$ must also be an integer multiple of $1/2$, with $1/k \neq 2$. Of the possible values for $k$, the only one for which $1/k$ has these properties is $k = 1$.
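The arithmetic at the end of Step 6 can be confirmed by brute force. The following sketch uses exact rational arithmetic from the Python standard library; the search bound of $k \le 20$ is an arbitrary choice for illustration (for $k > 2$ we have $0 < 2/k < 1$, so no larger $k$ can qualify).

```python
# Sketch: among positive half-integers k != 2, find those for which 1/k is
# also a half-integer multiple and 1/k != 2.
from fractions import Fraction

def is_multiple_of_half(q):
    """True if q is an integer multiple of 1/2."""
    return (2 * q).denominator == 1

candidates = [Fraction(n, 2) for n in range(1, 41)]   # k = 1/2, 1, ..., 20

allowed = [k for k in candidates
           if k != 2 and is_multiple_of_half(1 / k) and 1 / k != 2]

print(allowed)
```

The value $k = 1/2$ is ruled out because $1/k = 2$, which is Step 5 applied to $\beta$ in place of $\alpha$.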


Step 7. The root spaces $\mathfrak{g}_\alpha$ are one dimensional.


Suppose not. Then there exists another element $X'$ in $\mathfrak{g}_\alpha$ that is linearly independent of $X_\alpha$. In that case, $X'$ is an eigenvector for $\mathrm{ad}_{H_\alpha}$ with eigenvalue $\alpha(H_\alpha) = 2$. However, reasoning as in Step 5, if there is another eigenvector in $V^\alpha$ for $\mathrm{ad}_{H_\alpha}$ with eigenvalue 2 (independent of $X_\alpha$), then there must also be another eigenvector in $V^\alpha$ for $\mathrm{ad}_{H_\alpha}$ with eigenvalue 0, which we have seen is impossible since the intersection of $V^\alpha$ with $\mathfrak{h}$ is one dimensional.


It remains only to show the uniqueness of the elements $H_\alpha$. Since the root spaces $\mathfrak{g}_\alpha$ and $\mathfrak{g}_{-\alpha}$ are one dimensional, $H_\alpha$ is certainly unique up to a constant. However, this constant is determined by the condition that $[H_\alpha, X_\alpha] = 2X_\alpha$, which is independent of the normalization of $X_\alpha$ since both sides are linear in $X_\alpha$. To say the same thing in a different way, the normalization of $H_\alpha$ is determined by the condition $\alpha(H_\alpha) = 2$. This concludes the proof of Theorem 6.20. □


6.5 Inner Products of Roots and Co-roots 


We continue with the setting of the previous section: $\mathfrak{g}$ is a complex semisimple Lie algebra, $\mathfrak{k}$ is a compact real form of $\mathfrak{g}$, $\mathfrak{t}$ is a maximal commutative subalgebra of $\mathfrak{k}$, and $\mathfrak{h} = \mathfrak{t} + i\mathfrak{t}$ is the associated Cartan subalgebra. We choose, once and for all, an inner product $\langle\cdot,\cdot\rangle$ on $\mathfrak{g}$ that is invariant under the adjoint action of $K$ and takes real values on $\mathfrak{k}$, and we consider the restriction of this inner product to $\mathfrak{h}$.

This section will give a geometric picture of the roots, which were treated 
algebraically in the previous section. 


Proposition 6.22. Suppose $\alpha$ and $\beta$ are roots and $H_\alpha$ is the co-root associated to the root $\alpha$. Then, $\beta(H_\alpha)$ is an integer.


Proof. We once again let $\mathfrak{s}^\alpha = \mathrm{span}\{X_\alpha, Y_\alpha, H_\alpha\}$ be the subalgebra of $\mathfrak{g}$ (isomorphic to $\mathrm{sl}(2;\mathbb{C})$) given by Point 3 of Theorem 6.20. Then, $\mathfrak{s}^\alpha$ acts on $\mathfrak{g}$ by the adjoint action, so that $\mathfrak{g}$ becomes a finite-dimensional representation of $\mathfrak{s}^\alpha$. From our knowledge of the representations of $\mathrm{sl}(2;\mathbb{C})$, we know that any eigenvalue of $\mathrm{ad}_{H_\alpha}$ must be an integer. If $X_\beta$ is any nonzero element of $\mathfrak{g}_\beta$, then $[H_\alpha, X_\beta] = \beta(H_\alpha)X_\beta$, so $\beta(H_\alpha)$ is an eigenvalue for $\mathrm{ad}_{H_\alpha}$, which must then be an integer. □

Recall that the roots $\alpha$ are elements of the dual space $\mathfrak{h}^*$, whereas the co-roots $H_\alpha$ (defined in Theorem 6.20) are elements of $\mathfrak{h}$ itself. We are going to use the inner product on $\mathfrak{h}$ to identify $\mathfrak{h}^*$ with $\mathfrak{h}$, thus putting the roots and the co-roots into the same space and giving a more geometric picture of the roots and of the integrality condition in Proposition 6.22. We make use of the following elementary result from Section B.7.


Proposition 6.23. Given any linear functional $\alpha \in \mathfrak{h}^*$ (not necessarily a root), there exists a unique element $H^\alpha$ in $\mathfrak{h}$ such that

$$\alpha(H) = \langle H^\alpha, H\rangle$$

for all $H$ in $\mathfrak{h}$.


Observe that the notation is $H^\alpha$ (not $H_\alpha$). Recall that we take the inner product to be linear in the second factor. The map $\alpha \to H^\alpha$ is a one-to-one and onto correspondence between $\mathfrak{h}^*$ and $\mathfrak{h}$. However, this correspondence is not linear but rather conjugate-linear, since the inner product is conjugate-linear in the first factor (where $H^\alpha$ is).

It is convenient to permanently identify each root $\alpha \in \mathfrak{h}^*$ with the corresponding element $H^\alpha \in \mathfrak{h}$. Having done this, we then omit the $H^\alpha$ notation and denote that element of $\mathfrak{h}$ simply as $\alpha$.


Notation 6.24. From now on, we identify each root with the corresponding element of $\mathfrak{h}$ given by Proposition 6.23. Thus, we now regard a root $\alpha$ as a nonzero element of $\mathfrak{h}$ (not $\mathfrak{h}^*$) with the property that there exists a nonzero $X$ in $\mathfrak{g}$ with

$$[H, X] = \langle\alpha, H\rangle X$$

for all $H \in \mathfrak{h}$.


This notational change means that we now write $\langle\alpha, H\rangle$ every time that $\alpha(H)$ occurs in the previous section. So, for example, the assertion that $\beta(H_\alpha)$ is an integer now becomes the assertion that $\langle\beta, H_\alpha\rangle$ is an integer. The basic properties of the roots (Propositions 6.15 and 6.19) can now be translated into our new notation as follows.


Proposition 6.25. Let $R \subset \mathfrak{h}$ be the set of roots in the sense of Notation 6.24.


1. Each root $\alpha \in R$ is contained in $i\mathfrak{t} \subset \mathfrak{h}$.
2. The roots span $\mathfrak{h}$.
3. If $\alpha$ is a root, then so is $-\alpha$.


Proof. Points 2 and 3 follow from Proposition 6.19 under the identification of $\mathfrak{h}^*$ with $\mathfrak{h}$. (Although the correspondence between $\mathfrak{h}^*$ and $\mathfrak{h}$ is conjugate-linear rather than linear, it still takes spanning sets in $\mathfrak{h}^*$ to spanning sets in $\mathfrak{h}$.) For Point 1, we note that Proposition 6.15, translated into our new notation, says that $\langle\alpha, H\rangle$ is imaginary for all roots $\alpha$ and all $H$ in $\mathfrak{t}$. We now write $\alpha = \alpha_1 + i\alpha_2$ with $\alpha_1$ and $\alpha_2$ in $\mathfrak{t}$. Then, $\langle\alpha, \alpha_1\rangle$ must be imaginary, but also $\langle\alpha, \alpha_1\rangle = \langle\alpha_1, \alpha_1\rangle - i\langle\alpha_2, \alpha_1\rangle$. Since the inner product $\langle\cdot,\cdot\rangle$ is real on $\mathfrak{k}$ and, hence, on $\mathfrak{t}$, we must then have $\alpha_1 = 0$ or else $\langle\alpha, \alpha_1\rangle$ would not be imaginary. □


Proposition 6.26. Let $\alpha$ be a root in the sense of Notation 6.24 and let $H_\alpha$ be the corresponding co-root. Then, $\alpha$ and $H_\alpha$ are related by the formulas

$$H_\alpha = 2\,\frac{\alpha}{\langle\alpha, \alpha\rangle}, \tag{6.5}$$

$$\alpha = 2\,\frac{H_\alpha}{\langle H_\alpha, H_\alpha\rangle}. \tag{6.6}$$

The real content of this proposition is that once we use the inner product to identify $\mathfrak{h}^*$ with $\mathfrak{h}$ (so that the roots and co-roots now live in the same space), $\alpha$ and $H_\alpha$ are multiples of one another. Once this is known, the normalization is determined by the condition that $\langle\alpha, H_\alpha\rangle = 2$, which reflects that $[H_\alpha, X_\alpha] = 2X_\alpha$. Observe that both (6.5) and (6.6) are consistent with the relation $\langle\alpha, H_\alpha\rangle = 2$.


Proof. In the previous section, we established that $H_\alpha$ belongs to the one-dimensional subspace $(\ker\alpha)^\perp$ of $\mathfrak{h}$. Now that we are thinking of $\alpha$ as an element of $\mathfrak{h}$ instead of $\mathfrak{h}^*$, $\ker\alpha$ is equal to $\{H \in \mathfrak{h} \mid \langle\alpha, H\rangle = 0\}$, which is just the orthogonal complement of the span of $\alpha$. Thus $(\ker\alpha)^\perp$ is equal to $((\mathrm{span}\,\alpha)^\perp)^\perp = \mathrm{span}\,\alpha$. This means that $\alpha$ and $H_\alpha$ are multiples of one another. Then, as remarked above, the constants for expressing $\alpha$ in terms of $H_\alpha$ and vice versa are determined by the normalization condition $\langle\alpha, H_\alpha\rangle = 2$. □


Note that the expression (6.5) for $H_\alpha$ in terms of $\alpha$ is exactly parallel to the expression (6.6) for $\alpha$ in terms of $H_\alpha$. If we substitute (6.6) into (6.5) we obtain $H_\alpha = 4H_\alpha / (\langle\alpha, \alpha\rangle \langle H_\alpha, H_\alpha\rangle)$. Thus

$$\langle\alpha, \alpha\rangle \langle H_\alpha, H_\alpha\rangle = 4. \tag{6.7}$$

A more symmetric way of expressing the relationship between $\alpha$ and $H_\alpha$ is to say that they are multiples of one another and their lengths are related by (6.7).

If we restate Proposition 6.22 using our new point of view on the roots and also using (6.5), we obtain that

$$\langle\beta, H_\alpha\rangle = 2\,\frac{\langle\beta, \alpha\rangle}{\langle\alpha, \alpha\rangle} \tag{6.8}$$

must be an integer for all roots $\alpha$ and $\beta$. This implies that $\langle\beta, \alpha\rangle$ is a real number, and so we can just as well say that

$$2\,\frac{\langle\alpha, \beta\rangle}{\langle\alpha, \alpha\rangle}$$

is an integer. If we use (6.6) (applied to $\beta$) instead of (6.5), we obtain that

$$\langle\beta, H_\alpha\rangle = 2\,\frac{\langle H_\beta, H_\alpha\rangle}{\langle H_\beta, H_\beta\rangle}$$

is also an integer. We have obtained, then, the following result.


Theorem 6.27. Consider the roots in the sense of Notation 6.24 and the co-roots defined by Theorem 6.20 and satisfying Proposition 6.26. Then for all roots $\alpha$ and $\beta$, the quantities

$$2\,\frac{\langle\alpha, \beta\rangle}{\langle\alpha, \alpha\rangle} \tag{6.9}$$

and

$$2\,\frac{\langle H_\beta, H_\alpha\rangle}{\langle H_\beta, H_\beta\rangle} \tag{6.10}$$

are integers and, furthermore,

$$2\,\frac{\langle\alpha, \beta\rangle}{\langle\alpha, \alpha\rangle} = 2\,\frac{\langle H_\beta, H_\alpha\rangle}{\langle H_\beta, H_\beta\rangle}.$$

Note that the expressions (6.9) and (6.10) are both equal to $\langle\beta, H_\alpha\rangle$, and $\langle\beta, H_\alpha\rangle$ is the eigenvalue of $\mathrm{ad}_{H_\alpha}$ in the root space $\mathfrak{g}_\beta$. This is the reason that these quantities must be integers. Recall from elementary linear algebra that if $\alpha$ and $\beta$ are elements of some inner-product space, then the orthogonal projection of $\beta$ onto $\alpha$ is given by

$$\frac{\langle\alpha, \beta\rangle}{\langle\alpha, \alpha\rangle}\,\alpha.$$

The first quantity in Theorem 6.27 is thus twice the coefficient of $\alpha$ in the projection of $\beta$ onto $\alpha$. We may therefore interpret the integrality result in the following geometric way: If $\alpha$ and $\beta$ are roots, then the orthogonal projection of $\alpha$ onto $\beta$ must be an integer or half-integer multiple of $\beta$, and vice versa. This condition severely restricts the possible angles between $\alpha$ and $\beta$ and (if $\alpha$ and $\beta$ are not orthogonal) the possible ratios of their lengths. See Proposition 8.6.
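As an illustration of the integrality condition, the following sketch tabulates the quantity (6.9) for the roots of $\mathrm{sl}(3;\mathbb{C})$. It assumes the standard realization of these roots as the vectors $e_i - e_j$ ($i \neq j$) in $\mathbb{R}^3$ with the usual dot product; the overall normalization of the inner product is an assumption, but the integers computed do not depend on it. Function names are illustrative.

```python
# Sketch: the integers 2<alpha,beta>/<alpha,alpha> for the roots e_i - e_j
# of sl(3;C), realized (by assumption) in R^3 with the standard dot product.
from fractions import Fraction

def inner(u, v):
    return sum(Fraction(a) * b for a, b in zip(u, v))

def cartan_integer(alpha, beta):
    """The quantity (6.9): twice the coefficient of the projection of beta onto alpha."""
    return 2 * inner(alpha, beta) / inner(alpha, alpha)

def coroot(alpha):
    """H_alpha = 2 alpha / <alpha, alpha>, as in (6.5)."""
    return tuple(2 * Fraction(a) / inner(alpha, alpha) for a in alpha)

basis = [(1, 0, 0), (0, 1, 0), (0, 0, 1)]
roots = [tuple(x - y for x, y in zip(basis[i], basis[j]))
         for i in range(3) for j in range(3) if i != j]

values = {cartan_integer(a, b) for a in roots for b in roots}
print(sorted(values))

# Relation (6.7) between the lengths of a root and its co-root.
print(all(inner(a, a) * inner(coroot(a), coroot(a)) == 4 for a in roots))

# Equality of (6.9) and (6.10).
print(all(cartan_integer(a, b) == cartan_integer(coroot(b), coroot(a))
          for a in roots for b in roots))
```

Here `cartan_integer(coroot(b), coroot(a))` is $2\langle H_\beta, H_\alpha\rangle / \langle H_\beta, H_\beta\rangle$, so the last line compares the two expressions in Theorem 6.27.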


6.6 The Weyl Group 


We use here the compact-group approach to defining the Weyl group, as opposed to the Lie algebra approach. The compact-group approach makes certain aspects of the Weyl group more transparent. Nevertheless, the two approaches are equivalent. See the comments following Theorem 6.33 at the end of this section.

We continue with the setting of the previous section. Thus, $\mathfrak{g}$ is a complex semisimple Lie algebra given to us as a subalgebra of some $\mathrm{gl}(n;\mathbb{C})$. We have chosen a compact real form $\mathfrak{k}$ of $\mathfrak{g}$ and we let $K$ be the compact subgroup of $\mathrm{GL}(n;\mathbb{C})$ whose Lie algebra is $\mathfrak{k}$. We have chosen a maximal commutative subalgebra $\mathfrak{t}$ of $\mathfrak{k}$, and we work with the associated Cartan subalgebra $\mathfrak{h} = \mathfrak{t} + i\mathfrak{t}$. We have chosen an inner product on $\mathfrak{g}$ that is invariant under the adjoint action of $K$ and that takes real values on $\mathfrak{k}$.

Consider the following two subgroups of $K$:

$$\begin{aligned}
Z(\mathfrak{t}) &= \{A \in K \mid \mathrm{Ad}_A(H) = H \text{ for all } H \text{ in } \mathfrak{t}\}, \\
N(\mathfrak{t}) &= \{A \in K \mid \mathrm{Ad}_A(H) \in \mathfrak{t} \text{ for all } H \text{ in } \mathfrak{t}\}.
\end{aligned}$$

Clearly, $Z(\mathfrak{t})$ is a subgroup of $N(\mathfrak{t})$, and it is easily seen that $Z(\mathfrak{t})$ is a normal subgroup of $N(\mathfrak{t})$. See Exercise 10 for an explanation of the notation. If $T$ is the connected Lie subgroup of $K$ with Lie algebra $\mathfrak{t}$, then $T \subset Z(\mathfrak{t})$, since $T$ is generated by elements of the form $\exp H$ with $H$ in $\mathfrak{t}$. It turns out that, in fact, $Z(\mathfrak{t}) = T$. See Bröcker and tom Dieck (1985).


Definition 6.28. The Weyl group for $\mathfrak{g}$ is the quotient group $W = N(\mathfrak{t})/Z(\mathfrak{t})$.


We can define an action of $W$ on $\mathfrak{t}$ as follows. For each element $w$ of $W$, choose an element $A$ of the corresponding equivalence class in $N(\mathfrak{t})$. Then for $H$ in $\mathfrak{t}$ we define the action $w \cdot H$ of $w$ on $H$ by

$$w \cdot H = \mathrm{Ad}_A(H).$$

As in the SU(3) case (Section 5.6), it is easy to verify that this action is well defined (i.e., independent of the choice of $A$ in a given equivalence class). Since $\mathfrak{h} = \mathfrak{t} + i\mathfrak{t}$, each linear transformation of $\mathfrak{t}$ extends uniquely to a complex-linear transformation of $\mathfrak{h}$. Thus, we also think of $W$ as acting on $\mathfrak{h}$. If $w$ is an element of the Weyl group, then we write $w \cdot H$ for the action of $w$ on an element $H$ of $\mathfrak{h}$. It is easily seen that $W$ is isomorphic to the group of linear transformations of $\mathfrak{h}$ that can be expressed as $\mathrm{Ad}_A$ for some $A \in N(\mathfrak{t})$.


Proposition 6.29. 


1. The inner product $\langle\cdot,\cdot\rangle$ on $\mathfrak{h}$ is invariant under the action of $W$.

2. The set $R \subset \mathfrak{h}$ of roots is invariant under the action of $W$.

3. The set of co-roots is invariant under the action of $W$, and $w \cdot H_\alpha = H_{w\cdot\alpha}$ for all $w \in W$ and $\alpha \in R$.

4. The Weyl group is a finite group.


Proof. The action of $W$ on $\mathfrak{h}$ is nothing but the adjoint action of $N(\mathfrak{t}) \subset K$ on $\mathfrak{g}$, restricted to $\mathfrak{h}$. Since the inner product on $\mathfrak{g}$ is invariant under the adjoint action of $K$, the restriction of this inner product to $\mathfrak{h}$ is invariant under the adjoint action of $N(\mathfrak{t})$. This establishes Point 1.
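For Point 2, given an element $w$ of the Weyl group, let $A \in K$ be an element of $N(\mathfrak{t})$ that represents $w$. Now, suppose that $\alpha \in \mathfrak{h}$ is a root. Then (Notation 6.24), there exists a nonzero element $X$ of $\mathfrak{g}$ such that

$$[H, X] = \langle\alpha, H\rangle X$$

for all $H$ in $\mathfrak{h}$. Now, let us consider the element $\mathrm{Ad}_A(X)$ of $\mathfrak{g}$ and compute how $\mathfrak{h}$ acts on it. For $H$ in $\mathfrak{h}$, we have

$$[H, \mathrm{Ad}_A(X)] = \mathrm{Ad}_A([\mathrm{Ad}_{A^{-1}}(H), X]), \tag{6.11}$$

because $\mathrm{Ad}_A$ is a Lie algebra automorphism. Since $A$ is in $N(\mathfrak{t})$, $\mathrm{Ad}_{A^{-1}}(H)$ is again in $\mathfrak{h}$ and so

$$[\mathrm{Ad}_{A^{-1}}(H), X] = \langle\alpha, \mathrm{Ad}_{A^{-1}}(H)\rangle X.$$

Thus, (6.11) becomes

$$[H, \mathrm{Ad}_A(X)] = \langle\alpha, \mathrm{Ad}_{A^{-1}}(H)\rangle\, \mathrm{Ad}_A(X). \tag{6.12}$$

Since the inner product on $\mathfrak{h}$ is invariant under $\mathrm{Ad}_A$, we have

$$\langle\alpha, \mathrm{Ad}_{A^{-1}}(H)\rangle = \langle\mathrm{Ad}_A(\alpha), H\rangle = \langle w\cdot\alpha, H\rangle.$$

Thus, (6.12) becomes

$$[H, \mathrm{Ad}_A(X)] = \langle w\cdot\alpha, H\rangle\, \mathrm{Ad}_A(X).$$

This shows that $w\cdot\alpha$ is a root with root vector $\mathrm{Ad}_A(X)$ ($= w\cdot X$), establishing Point 2.

For Point 3, we write

$$H_\alpha = 2\,\frac{\alpha}{\langle\alpha, \alpha\rangle},$$

as in Proposition 6.26. Then

$$w\cdot H_\alpha = 2\,\frac{w\cdot\alpha}{\langle\alpha, \alpha\rangle} = 2\,\frac{w\cdot\alpha}{\langle w\cdot\alpha, w\cdot\alpha\rangle} = H_{w\cdot\alpha},$$

where in the second equality we have used the invariance of the inner product under the action of $W$.

Finally, we note that since the roots span $\mathfrak{h}$, the action of an element $w$ on $\mathfrak{h}$ is determined by what $w$ does to the roots. However, each $w$ preserves the set of roots and thus we may think of each $w$ as a permutation of the set of roots, and there are only finitely many of these. □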


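Let us now compute the Weyl group of $\mathrm{sl}(2;\mathbb{C})$. This is a worthwhile exercise in its own right, and, as with other $\mathrm{sl}(2;\mathbb{C})$ calculations, it will aid us in our analysis of general semisimple Lie algebras. We consider the compact real form $\mathfrak{k} = \mathrm{su}(2)$ of $\mathrm{sl}(2;\mathbb{C})$ and the maximal commutative subalgebra $\mathfrak{t}$ of $\mathfrak{k}$ given by

$$\mathfrak{t} = \left\{ \begin{pmatrix} ia & 0 \\ 0 & -ia \end{pmatrix} \,\middle|\, a \in \mathbb{R} \right\}.$$

Theorem 6.30. The subgroup $Z(\mathfrak{t})$ of SU(2) is given by

$$Z(\mathfrak{t}) = \left\{ \begin{pmatrix} e^{i\theta} & 0 \\ 0 & e^{-i\theta} \end{pmatrix} \,\middle|\, \theta \in \mathbb{R} \right\}$$

and $N(\mathfrak{t})$ is the set of matrices $A$ in SU(2) that are either in $Z(\mathfrak{t})$ or of the form

$$A = \begin{pmatrix} 0 & e^{i\theta} \\ -e^{-i\theta} & 0 \end{pmatrix} \tag{6.13}$$

for $\theta$ in $\mathbb{R}$. The Weyl group $N(\mathfrak{t})/Z(\mathfrak{t})$ has two elements. For any $H$ in $\mathfrak{h} = \mathfrak{t} + i\mathfrak{t}$ and for any $A$ of the form (6.13), we have

$$AHA^{-1} = -H.$$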


Proof. Let $H_0$ be the element of $\mathfrak{t}$ given by

$$H_0 = \begin{pmatrix} i & 0 \\ 0 & -i \end{pmatrix}.$$

Now, suppose that $A \in \mathrm{SU}(2)$ commutes with each element of $\mathfrak{t}$ and thus, in particular, with $H_0$. Since $A$ commutes with $H_0$, it must preserve each of the eigenspaces for $H_0$, which are $\mathbb{C}e_1$ and $\mathbb{C}e_2$, where $e_1$ and $e_2$ are the standard basis elements for $\mathbb{C}^2$. Thus, $Ae_1$ must equal $c_1e_1$ and $Ae_2$ must equal $c_2e_2$, for some constants $c_1$ and $c_2$. Thus,

$$A = \begin{pmatrix} c_1 & 0 \\ 0 & c_2 \end{pmatrix}.$$

We also require that $A$ be in SU(2), which means that $|c_1| = |c_2| = 1$ and that $c_2 = 1/c_1$. Thus,

$$A = \begin{pmatrix} e^{i\theta} & 0 \\ 0 & e^{-i\theta} \end{pmatrix} \tag{6.14}$$

for some $\theta$ in $\mathbb{R}$. Clearly, also, every matrix of the form (6.14) does, in fact, commute with every element of $\mathfrak{t}$, so $Z(\mathfrak{t})$ is precisely the group of matrices of this form.

We now compute $N(\mathfrak{t})$. If $A \in N(\mathfrak{t})$, then, in particular, $AH_0A^{-1}$ must be in $\mathfrak{t}$. Now, the eigenvectors of $H_0$ are $e_1$ and $e_2$, with eigenvalues $i$ and $-i$, respectively. The eigenvectors of $AH_0A^{-1}$ are then (check!) $Ae_1$ and $Ae_2$, with the same eigenvalues $i$ and $-i$. However, every element of $\mathfrak{t}$ has eigenvectors $e_1$ and $e_2$, so we must have either $Ae_1 = c_1e_1$ and $Ae_2 = c_2e_2$ (in which case, $A \in Z(\mathfrak{t})$) or $Ae_1 = c_1e_2$ and $Ae_2 = c_2e_1$, in which case,

$$A = \begin{pmatrix} 0 & c_2 \\ c_1 & 0 \end{pmatrix}.$$

Then, for $A$ to be in SU(2), we must have $|c_1| = |c_2| = 1$ and $c_2 = -1/c_1$, so

$$A = \begin{pmatrix} 0 & e^{i\theta} \\ -e^{-i\theta} & 0 \end{pmatrix}$$

for some real $\theta$. If $A$ is of this form, then we compute that

$$A \begin{pmatrix} ia & 0 \\ 0 & -ia \end{pmatrix} A^{-1} = \begin{pmatrix} -ia & 0 \\ 0 & ia \end{pmatrix} \tag{6.15}$$

and, indeed, $A$ is in $N(\mathfrak{t})$. Therefore, the elements of $N(\mathfrak{t})$ are precisely matrices of the form (6.14) or (6.13).

Now, a matrix of the form (6.13) is not in $Z(\mathfrak{t})$. However, if $A$ and $B$ are two matrices of the form (6.13), then direct calculation shows that $AB^{-1} \in Z(\mathfrak{t})$. This shows that the quotient group $N(\mathfrak{t})/Z(\mathfrak{t})$ has precisely two elements. Let us see how the Weyl group $N(\mathfrak{t})/Z(\mathfrak{t})$ acts on $\mathfrak{t}$. For $A$ of the form (6.13), the restriction of $\mathrm{Ad}_A$ to $\mathfrak{t}$ will be $-I$ (by (6.15)), and for $A$ of the form (6.14), the restriction of $\mathrm{Ad}_A$ to $\mathfrak{t}$ is $I$. So, the Weyl group may be identified with the two-element group $\{I, -I\}$ inside $\mathrm{GL}(\mathfrak{t}) \cong \mathrm{GL}(1;\mathbb{R})$. □
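The "direct calculation" steps in this proof can be spot-checked numerically. The sketch below builds an off-diagonal matrix in SU(2) of the kind described in (6.13), conjugates an element of $\mathfrak{t}$ by it, and also checks that $AB^{-1}$ is diagonal for two such matrices. The angles 0.7 and 2.1 and the value $a = 1.7$ are arbitrary sample values, and the helper names are illustrative.

```python
# Sketch: numerical check of A H A^{-1} = -H and of A B^{-1} in Z(t) for
# off-diagonal elements of SU(2). Sample angles are arbitrary.
import cmath

def dagger(m):
    n = len(m)
    return [[m[j][i].conjugate() for j in range(n)] for i in range(n)]

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def close(a, b, tol=1e-12):
    return all(abs(x - y) < tol for r, s in zip(a, b) for x, y in zip(r, s))

def off_diagonal(theta):
    """Off-diagonal unitary matrix with determinant 1, as in (6.13)."""
    u = cmath.exp(1j * theta)
    return [[0, u], [-u.conjugate(), 0]]

A = off_diagonal(0.7)
B = off_diagonal(2.1)
a = 1.7
H = [[1j * a, 0], [0, -1j * a]]          # an element of t

det_A = A[0][0] * A[1][1] - A[0][1] * A[1][0]
conjugated = matmul(matmul(A, H), dagger(A))   # A^{-1} = A^dagger for unitary A
AB_inv = matmul(A, dagger(B))

print(abs(det_A - 1) < 1e-12)
print(close(conjugated, [[-1j * a, 0], [0, 1j * a]]))              # equals -H
print(abs(AB_inv[0][1]) < 1e-12 and abs(AB_inv[1][0]) < 1e-12)     # diagonal
```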


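For application to general semisimple Lie algebras, it is useful to describe the Weyl group in Lie algebraic terms. So, let $X$, $Y$, and $H$ be the usual basis elements for $\mathrm{sl}(2;\mathbb{C})$. Then, the matrix

$$\frac{\pi}{2}(X - Y) = \begin{pmatrix} 0 & \pi/2 \\ -\pi/2 & 0 \end{pmatrix}$$

is in $\mathrm{su}(2)$ and by the calculation in Section 2.2, we have

$$\exp\left[\frac{\pi}{2}(X - Y)\right] = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}, \tag{6.16}$$

which represents the one nontrivial element of the Weyl group of $\mathrm{sl}(2;\mathbb{C})$. (Here, $\pi$ is the number $\pi = 3.14\ldots$, not a representation.) Then, for any $H$ in $\mathfrak{h}$, we have

$$\mathrm{Ad}_{\exp[\frac{\pi}{2}(X - Y)]}(H) = \exp\left[\frac{\pi}{2}(\mathrm{ad}_X - \mathrm{ad}_Y)\right](H) = -H. \tag{6.17}$$

The element on the right-hand side of (6.16) can also be computed as

$$\begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix} = e^{X} e^{-Y} e^{X}. \tag{6.18}$$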

We are now ready to apply our knowledge of the Weyl group of $\mathrm{sl}(2;\mathbb{C})$ to obtain information about the Weyl group of a general complex semisimple Lie algebra $\mathfrak{g}$. We now resume the context that $\mathfrak{g}$ is a complex semisimple Lie algebra, $\mathfrak{k}$ is a fixed compact real form, $\mathfrak{t}$ is a fixed maximal commutative subalgebra of $\mathfrak{k}$, and $\mathfrak{h} = \mathfrak{t} + i\mathfrak{t}$.

Theorem 6.31. For each root $\alpha$, there exists an element $w_\alpha$ of $W$ such that

$$w_\alpha \cdot \alpha = -\alpha$$

and such that

$$w_\alpha \cdot H = H$$

for all $H$ in $\mathfrak{h}$ with $\langle\alpha, H\rangle = 0$.

Note that since $H_\alpha$ is a multiple of $\alpha$, saying $w_\alpha \cdot \alpha = -\alpha$ is equivalent to saying that $w_\alpha \cdot H_\alpha = -H_\alpha$.

The linear operator corresponding to the action of $w_\alpha$ on $\mathfrak{h}$ is “the reflection about the hyperplane perpendicular to $\alpha$.” This means that $w_\alpha$ acts as the identity on the hyperplane (codimension-one subspace) perpendicular to $\alpha$ and as minus the identity on the span of $\alpha$. We can work out a formula for $w_\alpha$ as follows. Any vector $\beta$ can be decomposed uniquely as a multiple of $\alpha$ plus a vector orthogonal to $\alpha$. This decomposition is given explicitly by

$$\beta = \frac{\langle\alpha, \beta\rangle}{\langle\alpha, \alpha\rangle}\,\alpha + \left(\beta - \frac{\langle\alpha, \beta\rangle}{\langle\alpha, \alpha\rangle}\,\alpha\right), \tag{6.19}$$

where direct calculation shows that the second term is indeed orthogonal to $\alpha$. (The projection of $\beta$ onto $\alpha$ is a linear function of $\beta$ and so $\beta$ must go in the “linear” side of the inner product, which for us is the right-hand side.) Now, to obtain $w_\alpha \cdot \beta$, we should change the sign of the part of $\beta$ parallel to $\alpha$ and leave alone the part of $\beta$ that is orthogonal to $\alpha$. This means that we change the sign of the first term on the right-hand side of (6.19), giving

$$w_\alpha \cdot \beta = \beta - 2\,\frac{\langle\alpha, \beta\rangle}{\langle\alpha, \alpha\rangle}\,\alpha. \tag{6.20}$$

We now have another way of thinking about the quantity $2\langle\alpha, \beta\rangle/\langle\alpha, \alpha\rangle$ in Theorem 6.27: It is (up to sign) the coefficient of $\alpha$ in the expression for $w_\alpha \cdot \beta$. So, we can re-express Theorem 6.27 as follows.


Corollary 6.32. If $\alpha$ and $\beta$ are roots, then $\beta - w_\alpha \cdot \beta$ is an integer multiple of $\alpha$.


Note that for any $\beta$ in $\mathfrak{h}$, whether a root or not, $\beta - w_\alpha \cdot \beta$ will be a multiple of $\alpha$, as a consequence of the formula (6.20) for $w_\alpha$. The content of the corollary is that if $\beta$ happens to be a root, then $\beta - w_\alpha \cdot \beta$ is an integer multiple of $\alpha$.
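The reflection formula (6.20) and Corollary 6.32 can be tried out on a small example. The sketch below again assumes the standard realization of the roots of $\mathrm{sl}(3;\mathbb{C})$ as $e_i - e_j$ in $\mathbb{R}^3$ with the dot product; `reflect` is an illustrative name for the map $\beta \mapsto w_\alpha \cdot \beta$.

```python
# Sketch: the reflection w_alpha of (6.20), applied to the roots e_i - e_j
# of sl(3;C) (an assumed realization in R^3 with the dot product).
from fractions import Fraction

def inner(u, v):
    return sum(Fraction(a) * b for a, b in zip(u, v))

def reflect(alpha, beta):
    """w_alpha . beta = beta - 2 <alpha,beta>/<alpha,alpha> alpha."""
    c = 2 * inner(alpha, beta) / inner(alpha, alpha)
    return tuple(b - c * a for a, b in zip(alpha, beta))

roots = [(1, -1, 0), (1, 0, -1), (0, 1, -1),
         (-1, 1, 0), (-1, 0, 1), (0, -1, 1)]

alpha = (1, -1, 0)
print(reflect(alpha, alpha) == (-1, 1, 0))            # w_alpha . alpha = -alpha
print(reflect(alpha, (1, 1, -2)) == (1, 1, -2))       # orthogonal vectors are fixed

# Reflections permute the roots, and beta - w_alpha.beta is an integer
# multiple of alpha (Corollary 6.32).
for a in roots:
    for b in roots:
        image = reflect(a, b)
        n = 2 * inner(a, b) / inner(a, a)
        difference = tuple(x - y for x, y in zip(b, image))
        assert image in roots
        assert n.denominator == 1
        assert difference == tuple(n * x for x in a)
print("all reflections map roots to roots")
```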

We now turn to the proof of Theorem 6.31. 


Proof. We choose elements $X_\alpha$, $Y_\alpha$, and $H_\alpha$ as in Point 3 of Theorem 6.20. As shown in Step 3 of the proof of that theorem, it is possible to choose $X_\alpha$ and $Y_\alpha$ so that $Y_\alpha = X_\alpha^*$, in which case, $X_\alpha - Y_\alpha$ will be contained in $\mathfrak{k}$. (Recall the definition (6.2) of $X^*$.) We then let $A_\alpha$ be the element of $K$ given by

$$A_\alpha = \exp\left[\frac{\pi}{2}(X_\alpha - Y_\alpha)\right]. \tag{6.21}$$

We want to show that $A_\alpha$ is in $N(\mathfrak{t})$ and that $\mathrm{Ad}_{A_\alpha}$ acts in the indicated way on $\mathfrak{h}$.

Suppose, first, that $H$ is in $\mathfrak{h}$ and that $\langle\alpha, H\rangle = 0$. Then, $[H, X_\alpha] = \langle\alpha, H\rangle X_\alpha = 0$ and similarly for $Y_\alpha$; that is, $X_\alpha$ and $Y_\alpha$ commute with $H$. Therefore, using the relationship between Ad and ad (Proposition 2.25), we have

$$\mathrm{Ad}_{A_\alpha}(H) = \exp\left[\frac{\pi}{2}(\mathrm{ad}_{X_\alpha} - \mathrm{ad}_{Y_\alpha})\right](H) = H.$$

Now, let us consider the action of $\mathrm{Ad}_{A_\alpha}$ on the one-dimensional subspace of $\mathfrak{g}$ spanned by $H_\alpha$ (or by $\alpha$). We have, as above,

$$\mathrm{Ad}_{A_\alpha}(H_\alpha) = \exp\left[\frac{\pi}{2}(\mathrm{ad}_{X_\alpha} - \mathrm{ad}_{Y_\alpha})\right](H_\alpha). \tag{6.22}$$

The right-hand side of (6.22) involves only Lie algebra quantities. Thus, since the Lie algebra spanned by $X_\alpha$, $Y_\alpha$, and $H_\alpha$ is isomorphic to the one spanned by the usual $\mathrm{sl}(2;\mathbb{C})$ basis elements, the result of computing the right-hand side of (6.22) will be the same as in $\mathrm{sl}(2;\mathbb{C})$, which we have computed above in (6.17). We obtain, then, that

$$\mathrm{Ad}_{A_\alpha}(H_\alpha) = -H_\alpha. \tag{6.23}$$

See Exercise 8 for an alternative calculation of this result.

So, $\mathrm{Ad}_{A_\alpha}$ acts as the identity on elements $H$ with $\langle\alpha, H\rangle = 0$, and $\mathrm{Ad}_{A_\alpha}$ acts as minus the identity on the span of $H_\alpha$, which is the same as the span of $\alpha$. This shows that $A_\alpha$ represents an element of the Weyl group that acts in the indicated way on $\mathfrak{h}$. □


The element $A_\alpha$ in (6.21) can also be computed as

$$A_\alpha = e^{X_\alpha} e^{-Y_\alpha} e^{X_\alpha}. \tag{6.24}$$

(Compare (6.18).) Direct computation shows that the elements in (6.21) and (6.24) are equal in the $\mathrm{sl}(2;\mathbb{C})$ case and, then, the argument in the proof of Theorem 6.31 shows that they are equal in general. The description of the $A_\alpha$'s given in (6.24) will be useful in the Verma module construction of the representations of $\mathfrak{g}$. See Section 7.3.


Theorem 6.33. The Weyl group $W$ is generated by the elements $w_\alpha$ as $\alpha$ ranges over all roots.


That is to say, the smallest subgroup of $W$ that contains all of the $w_\alpha$'s is $W$ itself. This is somewhat involved to prove and I will not do so here; see Bröcker and tom Dieck (1985). In the Lie algebra approach to the Weyl group, the Weyl group is defined as the set of linear transformations of $\mathfrak{h}$ generated by the reflections $w_\alpha$. Theorem 6.33 shows that the Lie algebra definition of the Weyl group gives the same group as the compact-group approach.
6.7 Root Systems


In the previous sections, we have established several properties of the roots. From Proposition 6.15, we know that the roots are imaginary on $\mathfrak{t}$, which, after transferring the roots from $\mathfrak{h}^*$ to $\mathfrak{h}$ (as in Notation 6.24), means that the roots live in $i\mathfrak{t} \subset \mathfrak{h}$. The inner product $\langle\cdot,\cdot\rangle$ was constructed to take real values on $\mathfrak{k}$ and, hence, on $\mathfrak{t}$. The inner product then also takes real values on $i\mathfrak{t}$, since $\langle iX, iY\rangle = (-i)i\langle X, Y\rangle = \langle X, Y\rangle$. So, the roots live in the real inner-product space $E = i\mathfrak{t}$.

From Proposition 6.19 and Theorem 6.20 we know that the roots span $i\mathfrak{t}$ and that if $\alpha$ is a root, then $-\alpha$ is a root but no other multiples of $\alpha$ are roots. Theorem 6.27 tells us that for any roots $\alpha$ and $\beta$, $2\langle\alpha, \beta\rangle/\langle\alpha, \alpha\rangle$ is an integer. Proposition 6.29 tells us that the roots are invariant under the action of the Weyl group, and Theorem 6.31 tells us that the Weyl group contains the reflection about the hyperplane orthogonal to each root $\alpha$. We summarize these results in the following theorem.


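Theorem 6.34. The roots form a finite set of nonzero elements of a real inner-product space $E$ and have the following properties:


1. The roots span $E$.

2. If $\alpha$ is a root, then $-\alpha$ is a root and the only multiples of $\alpha$ that are roots are $\alpha$ and $-\alpha$.

3. If $\alpha$ is a root, let $w_\alpha$ denote the linear transformation of $E$ given by

$$w_\alpha \cdot \beta = \beta - 2\,\frac{\langle\alpha, \beta\rangle}{\langle\alpha, \alpha\rangle}\,\alpha.$$

Then, for all roots $\alpha$ and $\beta$, $w_\alpha \cdot \beta$ is also a root.

4. If $\alpha$ and $\beta$ are roots, then the quantity

$$2\,\frac{\langle\alpha, \beta\rangle}{\langle\alpha, \alpha\rangle}$$

is an integer.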

Any collection of vectors in a finite-dimensional real inner-product space having these properties is called a root system. The Weyl group for a root system $R$ is the group of linear transformations of $E$ generated by the $w_\alpha$'s. We will look more closely at the properties of root systems in Chapter 8. Note that Point 4 is equivalent to saying that $\beta - w_\alpha \cdot \beta$ must be an integer multiple of $\alpha$ for all roots $\alpha$ and $\beta$.
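The defining properties are finite checks, so they can be tested mechanically on small examples. The sketch below is an illustration rather than anything from the text: it checks Points 2 through 4 for a list of vectors (taking $E$ to be their span, so that Point 1 holds by construction) and computes the order of the group generated by the reflections by tracking how they permute the roots. The two sample systems are the roots $e_i - e_j$ in $\mathbb{R}^3$ and the eight vectors $\pm e_1$, $\pm e_2$, $\pm e_1 \pm e_2$ in $\mathbb{R}^2$; function names are hypothetical.

```python
# Sketch: test Points 2-4 of the root-system definition and compute the
# order of the group generated by the reflections w_alpha.
from fractions import Fraction

def inner(u, v):
    return sum(Fraction(a) * b for a, b in zip(u, v))

def reflect(alpha, beta):
    c = 2 * inner(alpha, beta) / inner(alpha, alpha)
    return tuple(b - c * a for a, b in zip(alpha, beta))

def neg(v):
    return tuple(-a for a in v)

def satisfies_points_2_to_4(roots):
    for a in roots:
        if neg(a) not in roots:
            return False
        for b in roots:
            parallel = inner(a, b) ** 2 == inner(a, a) * inner(b, b)
            if parallel and b != a and b != neg(a):
                return False                      # Point 2 fails
            if reflect(a, b) not in roots:
                return False                      # Point 3 fails
            if (2 * inner(a, b) / inner(a, a)).denominator != 1:
                return False                      # Point 4 fails
    return True

def weyl_group_order(roots):
    """Order of the group generated by the reflections, via permutations of roots."""
    gens = [tuple(roots.index(reflect(a, b)) for b in roots) for a in roots]
    identity = tuple(range(len(roots)))
    seen, frontier = {identity}, [identity]
    while frontier:
        p = frontier.pop()
        for g in gens:
            q = tuple(g[i] for i in p)            # apply p first, then g
            if q not in seen:
                seen.add(q)
                frontier.append(q)
    return len(seen)

A2 = [(1, -1, 0), (1, 0, -1), (0, 1, -1), (-1, 1, 0), (-1, 0, 1), (0, -1, 1)]
B2 = [(1, 0), (-1, 0), (0, 1), (0, -1), (1, 1), (1, -1), (-1, 1), (-1, -1)]

print(satisfies_points_2_to_4(A2), weyl_group_order(A2))
print(satisfies_points_2_to_4(B2), weyl_group_order(B2))
```

Since the roots span $E$, a linear map generated by reflections is determined by the permutation it induces on the roots, which is why counting permutations gives the order of the group.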

We have also established certain important properties of the root spaces that are not properties of the roots themselves, namely that each root space $\mathfrak{g}_\alpha$ is one dimensional and that out of $\mathfrak{g}_\alpha$, $\mathfrak{g}_{-\alpha}$, and $[\mathfrak{g}_\alpha, \mathfrak{g}_{-\alpha}]$, we can form a subalgebra isomorphic to $\mathrm{sl}(2;\mathbb{C})$.

Finally, I claim that the co-roots $H_\alpha$ (as in Point 3 of Theorem 6.20) themselves form a root system. Theorem 6.27 tells us that the co-roots satisfy Property 4 and Proposition 6.29 tells us that the set of co-roots is invariant under the Weyl group and hence, in particular, under the reflections $w_\alpha$. However, note that since $H_\alpha$ is a multiple of $\alpha$, the reflection generated by $H_\alpha$ is the same as the reflection generated by $\alpha$. Thus, the set of co-roots satisfies Property 3. Properties 1 and 2 for the co-roots follow from the corresponding properties for the roots, since each $H_\alpha$ is a multiple of $\alpha$. The set of co-roots is called the “dual root system” to the set of roots. See Chapter 8 for more information on root systems, including many pictures.


6.8 Positive Roots 


In the next chapter, we will classify the irreducible representations of $\mathfrak{g}$ in terms of a “highest weight.” In the $\mathrm{sl}(3;\mathbb{C})$ case, we defined “highest” in terms of the two roots $\alpha_1$ and $\alpha_2$. There is nothing sacred about those particular two roots. What we need is simply some consistent notion of higher and lower that will allow us to divide the root vectors $X_\alpha$ into “raising operators” and “lowering operators.” This should be done in such a way that the commutator of two raising operators is, again, a raising operator and not a lowering operator. This means that we want to divide the roots into two groups, one of which will be called “positive” and the other “negative.” This should be done in such a way that if the sum of two positive roots is again a root, that root is positive. There is no unique way to make the division into positive and negative; any consistent division will do. The uniqueness theorems of the next section show that it does not really matter which choice we make.

The following definition and theorem show that it is possible to make a good choice.


Definition 6.35. Suppose that $E$ is a finite-dimensional real inner-product
space and that $R \subset E$ is a root system. Then, a base for $R$ is a subset
$\Delta = \{\alpha_1, \ldots, \alpha_r\}$ of $R$ such that $\Delta$ forms a basis for $E$ as a vector space and
such that for each $\alpha \in R$, we have

\[ \alpha = n_1\alpha_1 + n_2\alpha_2 + \cdots + n_r\alpha_r, \]

where the $n_j$'s are integers and either all greater than or equal to zero or all
less than or equal to zero.

Once a base $\Delta$ has been chosen, the $\alpha$'s for which $n_j \geq 0$ are called the
positive roots (with respect to the given choice of $\Delta$) and the $\alpha$'s with $n_j \leq 0$
are called the negative roots. The elements of $\Delta$ are called the positive
simple roots.


Therefore, to be a base (in the sense of root systems), $\Delta \subset R$ must, in
particular, be a basis for $E$ in the vector space sense. In addition, the expansion
of any $\alpha \in R$ in terms of the elements of $\Delta$ must have integer coefficients and
all of the nonzero coefficients (for a given $\alpha$) must be of the same sign.




Theorem 6.36. For any root system, a base exists. 


The proof of Theorem 6.36 is given in Section 8.3. In the case of sl(3;C), 
one should verify that any pair of roots with a 120° angle constitutes a base, 
but that a pair of roots with a 60° angle does not constitute a base. See 
Chapter 8 for additional examples and pictures. 
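
The verification suggested above for sl(3;C) can be sketched numerically. The following is only an illustration under stated assumptions: the six roots are modeled as the integer vectors $e_k - e_l$ in $\mathbb{R}^3$, and the helper names (`a2_roots`, `is_base`) and the brute-force search bound are choices made here, not part of the text. The search only detects integer expansions, which is all Definition 6.35 permits.

```python
from itertools import product

def a2_roots():
    """Roots of sl(3;C), modeled as the integer vectors e_k - e_l (k != l)."""
    roots = []
    for k in range(3):
        for l in range(3):
            if k != l:
                v = [0, 0, 0]
                v[k] += 1
                v[l] -= 1
                roots.append(tuple(v))
    return roots

def is_base(candidate, roots, bound=3):
    """Brute-force test of Definition 6.35 for a two-element candidate.

    Every root must equal n1*a + n2*b with integers n1, n2 that are
    both >= 0 or both <= 0. The bound on |n1|, |n2| is an assumption
    that is ample for this small example.
    """
    a, b = candidate
    for r in roots:
        found = None
        for n1, n2 in product(range(-bound, bound + 1), repeat=2):
            if tuple(n1 * x + n2 * y for x, y in zip(a, b)) == r:
                found = (n1, n2)
                break
        if found is None:
            return False  # no integer expansion exists
        n1, n2 = found
        if not ((n1 >= 0 and n2 >= 0) or (n1 <= 0 and n2 <= 0)):
            return False  # coefficients of mixed sign
    return True

roots = a2_roots()
pair_120 = ((1, -1, 0), (0, 1, -1))  # inner product -1: angle 120 degrees
pair_60 = ((1, -1, 0), (1, 0, -1))   # inner product +1: angle 60 degrees
print(is_base(pair_120, roots))
print(is_base(pair_60, roots))
```

For the 60° pair, the root $e_2 - e_3$ is the second candidate minus the first, so its coefficients have mixed signs and the test fails.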


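Proposition 6.37. If $R$ is the set of roots of $\mathfrak{g}$ relative to $\mathfrak{h} = \mathfrak{t} + i\mathfrak{t}$ and
if $\Delta = \{\alpha_1, \ldots, \alpha_r\}$ is a base for $R$, then $\{H_{\alpha_1}, \ldots, H_{\alpha_r}\}$ is a base for the
system of co-roots.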


We will prove this in Chapter 8. 


6.9 The sl(n;C) Case 


Let us see how all of the structures described in this chapter work out in the 
case g = sl(n;C). For calculations in the case of other classical Lie algebras, 
see Exercises 12, 13, and 14 in this chapter and Section 8.8. 


6.9.1 The Cartan subalgebra 


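We work with the compact real form $\mathfrak{k} = \mathrm{su}(n)$ and the maximal commutative
subalgebra $\mathfrak{t}$ which is the intersection of the set of diagonal matrices with
su(n); that is,

\[ \mathfrak{t} = \left\{ \begin{pmatrix} ia_1 & & \\ & \ddots & \\ & & ia_n \end{pmatrix} \,\middle|\, a_j \in \mathbb{R},\ a_1 + \cdots + a_n = 0 \right\}. \tag{6.25} \]

It is clear that $\mathfrak{t}$ is a commutative subalgebra of $\mathfrak{k}$; that $\mathfrak{t}$ is maximal
commutative will be evident once we compute the roots. The associated Cartan
subalgebra is then $\mathfrak{h} = \mathfrak{t} + i\mathfrak{t}$, so that $\mathfrak{h}$ is the set of all diagonal matrices with
trace zero:

\[ \mathfrak{h} = \left\{ \begin{pmatrix} \lambda_1 & & \\ & \ddots & \\ & & \lambda_n \end{pmatrix} \,\middle|\, \lambda_j \in \mathbb{C},\ \lambda_1 + \cdots + \lambda_n = 0 \right\}. \tag{6.26} \]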


6.9.2 The roots 


Now, let $E_{kl}$ denote the matrix that has a one in the $k$th row and $l$th column
and that has zeros elsewhere. A simple calculation shows that if $H \in \mathfrak{h}$ is as
in (6.26), then $HE_{kl} = \lambda_k E_{kl}$ and $E_{kl}H = \lambda_l E_{kl}$. Thus,

\[ [H, E_{kl}] = (\lambda_k - \lambda_l)E_{kl}. \tag{6.27} \]

If $k = l$, then $E_{kl}$ does not have trace zero and so is not in sl(n;C). If $k \neq l$,
then $E_{kl}$ is in sl(n;C) and (6.27) shows that $E_{kl}$ is a simultaneous eigenvector
for each $\mathrm{ad}_H$ with $H$ in $\mathfrak{h}$, with eigenvalue $\lambda_k - \lambda_l$. Note that every element
$X$ of sl(n;C) can be written uniquely as an element of the Cartan subalgebra
(the diagonal entries of $X$) plus a linear combination of the $E_{kl}$'s with $k \neq l$
(the off-diagonal entries of $X$). From this it is not hard to see that $\mathfrak{t}$ is actually
maximal commutative and, therefore, that $\mathfrak{h}$ is actually a Cartan subalgebra
(Exercise 15).
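
The commutation relation (6.27) is easy to confirm by machine for a sample diagonal matrix. The sketch below uses plain Python lists as matrices; the helper names and the sample entries `[2, -5, 3]` are illustrative assumptions, and indices are 0-based rather than 1-based as in the text.

```python
def unit_matrix(n, k, l):
    """E_kl: a one in row k, column l (0-indexed), zeros elsewhere."""
    return [[1 if (i, j) == (k, l) else 0 for j in range(n)] for i in range(n)]

def diag(entries):
    n = len(entries)
    return [[entries[i] if i == j else 0 for j in range(n)] for i in range(n)]

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][m] * B[m][j] for m in range(n)) for j in range(n)]
            for i in range(n)]

def bracket(A, B):
    """The commutator [A, B] = AB - BA."""
    AB, BA = matmul(A, B), matmul(B, A)
    n = len(A)
    return [[AB[i][j] - BA[i][j] for j in range(n)] for i in range(n)]

def scale(c, A):
    return [[c * x for x in row] for row in A]

def satisfies_root_relation(lam):
    """Check [H, E_kl] = (lam_k - lam_l) E_kl for all k != l."""
    n = len(lam)
    H = diag(lam)
    return all(
        bracket(H, unit_matrix(n, k, l))
        == scale(lam[k] - lam[l], unit_matrix(n, k, l))
        for k in range(n) for l in range(n) if k != l
    )

lam = [2, -5, 3]  # sample diagonal entries with trace zero
print(satisfies_root_relation(lam))
```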

If we think at first of the roots as elements of $\mathfrak{h}^*$, then (according to
(6.27)) the roots are the linear functionals $\alpha_{kl}$ that associate to each $H \in \mathfrak{h}$,
as in (6.26), the quantity $\lambda_k - \lambda_l$. Note that $\alpha_{lk} = -\alpha_{kl}$ but that no other
multiple of $\alpha_{kl}$ is a root. Also, each root space is one dimensional, namely the
span of $E_{kl}$. For each root $\alpha = \alpha_{kl}$, we may take $X_\alpha = E_{kl}$, $Y_\alpha = E_{lk}$, and
$H_\alpha = [X_\alpha, Y_\alpha] = E_{kk} - E_{ll}$. The subalgebra spanned by $X_\alpha$, $Y_\alpha$, and $H_\alpha$ is
just the copy of sl(2;C) inside sl(n;C) in which all the action is in the $k$th and
$l$th coordinates.

The roots of sl(n;C) form a root system that is conventionally called $A_{n-1}$,
with the subscript $n-1$ indicating that the rank of sl(n;C) (i.e., the dimension
of $\mathfrak{h}$) is $n-1$.


6.9.3 Inner products of roots 

We use the Hilbert-Schmidt inner product on sl(n;C), namely

\[ \langle X, Y \rangle = \operatorname{trace}(X^*Y), \]

where $X^*$ is the usual matrix adjoint of $X$. This inner product is invariant
under the adjoint action of SU(n), as is easily verified (Exercise 9). When we
restrict this inner product to $\mathfrak{h}$ we get the “obvious” inner product on $\mathfrak{h}$, in
which the inner product of $\mathrm{diag}(\lambda_1, \ldots, \lambda_n)$ with $\mathrm{diag}(\sigma_1, \ldots, \sigma_n)$ is equal to
$\bar{\lambda}_1\sigma_1 + \cdots + \bar{\lambda}_n\sigma_n$. (Here $\mathrm{diag}(\cdot)$ is the diagonal matrix with the indicated
diagonal entries.) If we use this inner product as in Notation 6.24 to transfer
the roots from $\mathfrak{h}^*$ to $\mathfrak{h}$, we obtain

\[ \alpha_{kl} = E_{kk} - E_{ll} \quad (k \neq l). \]

We see, then, that each root satisfies

\[ \langle \alpha_{kl}, \alpha_{kl} \rangle = 2. \]

Furthermore, $\langle \alpha_{kl}, \alpha_{k'l'} \rangle$ has the value $0$, $\pm 1$, or $\pm 2$, depending on whether
$\{k, l\}$ and $\{k', l'\}$ have zero, one, or two elements in common. Thus

\[ 2\frac{\langle \alpha_{kl}, \alpha_{k'l'} \rangle}{\langle \alpha_{kl}, \alpha_{kl} \rangle} \in \{0, \pm 1, \pm 2\}. \]

Note that all the roots have length $\sqrt{2}$. If $\alpha$ and $\beta$ are roots and $\alpha \neq \beta$
and $\alpha \neq -\beta$, then the angle between $\alpha$ and $\beta$ is either 60° (if $\langle \alpha, \beta \rangle = 1$),
90° (if $\langle \alpha, \beta \rangle = 0$), or 120° (if $\langle \alpha, \beta \rangle = -1$). (In other root systems, the roots
may not all have the same length, and other angles can arise.) Since each root
for sl(n;C) has length $\sqrt{2}$ (with this choice of inner product), we see that
the co-roots $H_\alpha = 2\alpha/\langle \alpha, \alpha \rangle$ simply coincide with the roots $\alpha$. (In other root
systems, the system of co-roots may be inequivalent to the system of roots.)
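
These inner products and angles can be tabulated directly. The sketch below assumes the identification of $\alpha_{kl}$ with the vector $e_k - e_l$ of diagonal entries, and takes $n = 4$ as a sample value (for $n = 3$ no two roots are orthogonal, so the 90° case does not appear); the names `roots` and `dot` are illustrative.

```python
import math

def roots(n):
    """All alpha_kl = e_k - e_l (k != l), as tuples of diagonal entries."""
    out = []
    for k in range(n):
        for l in range(n):
            if k != l:
                v = [0] * n
                v[k], v[l] = 1, -1
                out.append(tuple(v))
    return out

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

R = roots(4)

# Every inner product of two roots, and every squared length.
inner_products = sorted({dot(a, b) for a in R for b in R})
lengths_squared = {dot(a, a) for a in R}

# Angles between roots alpha, beta with beta != alpha and beta != -alpha.
# Both roots have length sqrt(2), so cos(angle) = <alpha, beta> / 2.
angles = sorted({
    round(math.degrees(math.acos(dot(a, b) / 2)))
    for a in R for b in R
    if a != b and a != tuple(-x for x in b)
})
print(inner_products, lengths_squared, angles)
```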


6.9.4 The Weyl group 


Following the argument in the computation of the Weyl group of sl(2;C), we
see that $Z(\mathfrak{t})$ is the subgroup of $K = \mathrm{SU}(n)$ consisting of diagonal matrices,
and $N(\mathfrak{t})$ is the set of matrices $A$ in SU(n) such that for each $k$ there exists an
$l$ such that $Ae_k$ is a multiple of $e_l$. This means that associated to each $A$ in
$N(\mathfrak{t})$ there is a permutation $k \mapsto l$. So, the Weyl group is isomorphic to the
permutation group $S_n$ and it acts on $\mathfrak{h}$ by permuting the diagonal entries. For
each $\alpha = \alpha_{kl}$, the Weyl group element $w_\alpha$ acts by interchanging the $k$th and
$l$th diagonal entries, and the Weyl group is generated by such interchanges.
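
As a small check, one can apply the standard reflection formula $w_\alpha \cdot H = H - 2\frac{\langle \alpha, H \rangle}{\langle \alpha, \alpha \rangle}\alpha$ to a diagonal matrix and see the two entries trade places. This is a sketch under assumptions: diagonal matrices are stored as tuples of their entries, the sample $H$ has real entries (so conjugation in the inner product plays no role), and the function names are chosen here for illustration.

```python
from fractions import Fraction

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def root(n, k, l):
    """alpha_kl = E_kk - E_ll as a tuple of diagonal entries (0-indexed)."""
    v = [0] * n
    v[k], v[l] = 1, -1
    return tuple(v)

def reflect(alpha, H):
    """w_alpha . H = H - (2<alpha,H>/<alpha,alpha>) alpha."""
    c = Fraction(2 * dot(alpha, H), dot(alpha, alpha))
    return tuple(h - c * a for h, a in zip(H, alpha))

def weyl_orbit(H):
    """All images of H under products of the reflections w_alpha."""
    n = len(H)
    reflections = [root(n, k, l) for k in range(n) for l in range(k + 1, n)]
    seen = {tuple(H)}
    frontier = [tuple(H)]
    while frontier:
        point = frontier.pop()
        for alpha in reflections:
            image = reflect(alpha, point)
            if image not in seen:
                seen.add(image)
                frontier.append(image)
    return seen

H = (5, -1, -4)  # sample trace-zero diagonal entries, all distinct
swapped = reflect(root(3, 0, 2), H)  # first and third entries interchanged
orbit = weyl_orbit(H)
print(swapped, len(orbit))
```

Because the entries of the sample $H$ are distinct, the orbit has one point per permutation, which is consistent with the Weyl group being $S_3$ in this case.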


6.9.5 Positive roots 


Finally, we can find a base as follows. We take as our base the roots $\alpha_{kl}$ with
$l = k+1$ (i.e., the roots $\alpha_{k,k+1}$). Recall that $\alpha_{kl} = E_{kk} - E_{ll}$. If $k < l$, we
note that

\[ E_{kk} - E_{ll} = (E_{kk} - E_{k+1,k+1}) + (E_{k+1,k+1} - E_{k+2,k+2}) + \cdots + (E_{l-1,l-1} - E_{ll}) \]

and, so,

\[ \alpha_{kl} = \alpha_{k,k+1} + \alpha_{k+1,k+2} + \cdots + \alpha_{l-1,l}. \]

Thus, every root $\alpha_{kl}$ with $k < l$ can be written as a linear combination of the
simple roots with non-negative integer coefficients. (In fact, the coefficients
are either 0 or 1.) So, the $\alpha_{kl}$'s with $k < l$ are the positive roots and the $\alpha_{kl}$'s
with $k > l$ are the negative roots. Every negative root is just the negative of
a positive root and so can be written as a linear combination of the positive
simple roots with nonpositive integer coefficients.
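
The telescoping decomposition can be written out explicitly. The sketch below records each $\alpha_{kl}$ by its diagonal entries, uses 1-based indices to match the text, and takes $n = 4$ as a sample; the function names are illustrative assumptions.

```python
def alpha(n, k, l):
    """alpha_kl = E_kk - E_ll, recorded by its diagonal (1-indexed k, l)."""
    v = [0] * n
    v[k - 1] += 1
    v[l - 1] -= 1
    return v

def simple_root_coefficients(n, k, l):
    """Coefficients n_1, ..., n_{n-1} of alpha_kl in the base alpha_{j,j+1}.

    For k < l the coefficient of alpha_{j,j+1} is 1 when k <= j < l and 0
    otherwise; for k > l every sign is reversed.
    """
    coeffs = [0] * (n - 1)
    sign = 1 if k < l else -1
    for j in range(min(k, l), max(k, l)):
        coeffs[j - 1] = sign
    return coeffs

def recombine(n, coeffs):
    """Form the sum of n_j * alpha_{j,j+1}."""
    total = [0] * n
    for j, c in enumerate(coeffs, start=1):
        simple = alpha(n, j, j + 1)
        total = [t + c * s for t, s in zip(total, simple)]
    return total

n = 4
for k in range(1, n + 1):
    for l in range(1, n + 1):
        if k != l:
            print((k, l), simple_root_coefficients(n, k, l))
```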


6.10 Uniqueness Results 


Given a complex semisimple Lie algebra $\mathfrak{g}$, we have made three choices: a
compact real form $\mathfrak{k}$ of $\mathfrak{g}$, a maximal commutative subalgebra $\mathfrak{t}$ of $\mathfrak{k}$, and a system
of positive roots. None of these objects is unique, and so it is important to
think about how different possibilities are related. Furthermore, we have only
considered Cartan subalgebras that arise as the complexification of maximal
commutative subalgebras of a compact real form. It is not obvious that every
Cartan subalgebra arises in this way.

The following theorems take care of all such worries. They tell us that each
structure is unique up to the adjoint action of the relevant group ($G$, $K$, or $W$).

We assume that g is given to us as a subalgebra of some gl(n;C), and we 
let G denote the connected Lie subgroup of GL(n; C) with Lie algebra g. 


Theorem 6.38. Suppose that $\mathfrak{k}_1$ and $\mathfrak{k}_2$ are two compact real forms of $\mathfrak{g}$. Then,
there is an element $A$ of $G$ such that $\mathrm{Ad}_A(\mathfrak{k}_1) = \mathfrak{k}_2$.


Theorem 6.39. Suppose that $\mathfrak{k}$ is a compact real form of $\mathfrak{g}$ and that $K$ is the
compact subgroup of $G$ whose Lie algebra is $\mathfrak{k}$. Suppose that $\mathfrak{t}_1$ and $\mathfrak{t}_2$ are two
maximal commutative subalgebras of $\mathfrak{k}$. Then, there is an element $A$ of $K$ such
that $\mathrm{Ad}_A(\mathfrak{t}_1) = \mathfrak{t}_2$.


Theorem 6.40. If $\mathfrak{h}$ is a Cartan subalgebra of $\mathfrak{g}$, then there exists a compact
real form $\mathfrak{k}$ of $\mathfrak{g}$ and a maximal commutative subalgebra $\mathfrak{t}$ of $\mathfrak{k}$ such that
$\mathfrak{h} = \mathfrak{t} + i\mathfrak{t}$. If $\mathfrak{h}_1$ and $\mathfrak{h}_2$ are two Cartan subalgebras of $\mathfrak{g}$, then there exists $A \in G$
such that $\mathrm{Ad}_A(\mathfrak{h}_1) = \mathfrak{h}_2$.


Theorem 6.41. Any two systems of positive simple roots can be mapped into
one another by the action of the Weyl group.


In the last theorem, it should be understood that different orderings of
$\alpha_1, \ldots, \alpha_r$ count as the same system of positive simple roots.

I will not prove these results. Exercises 16, 18, and 19 provide proofs of 
some of these results in the case g = sl(n; C). 

A related issue is the extent to which one can recover a semisimple Lie
algebra from its root system. This will be discussed in Chapter 8. The result
is that if $\mathfrak{g}_1$ and $\mathfrak{g}_2$ are complex semisimple Lie algebras and the root system
for $\mathfrak{g}_1$ is isomorphic (in the appropriate sense) to the root system for $\mathfrak{g}_2$, then
$\mathfrak{g}_1$ and $\mathfrak{g}_2$ are isomorphic Lie algebras. See Theorem 8.28.

1. Show that the center of any semisimple Lie algebra $\mathfrak{g}$ is trivial. Show that
the adjoint representation of $\mathfrak{g}$ is faithful.

2. Suppose that $\mathfrak{g}$ is a complex Lie algebra with the complete reducibility
property. Show that $\mathfrak{g}$ is semisimple.
Hint: First show that a one-dimensional commutative Lie algebra does
not have the complete reducibility property.

3. Let $\mathfrak{h}_{\mathbb{C}}$ denote the complexification of the Lie algebra of the Heisenberg
group, namely the space of all complex 3 x 3 upper triangular matrices
with zeros on the diagonal.
(a) Show that every maximal commutative subalgebra of $\mathfrak{h}_{\mathbb{C}}$ is two
dimensional and contains the center of $\mathfrak{h}_{\mathbb{C}}$.
(b) Show that $\mathfrak{h}_{\mathbb{C}}$ does not have any Cartan subalgebras.


4. Give an example of a maximal commutative subalgebra of sl(2;C) that is
not a Cartan subalgebra.

5. Verify Proposition 6.18 by direct calculation in the case $\mathfrak{g}$ = sl(3;C), using
the Cartan subalgebra $\mathfrak{h} = \mathrm{span}\{H_1, H_2\}$.


6. Let $\mathfrak{g}$ denote the vector space of 3 x 3 complex matrices of the form

\[ \begin{pmatrix} A & B \\ 0 & 0 \end{pmatrix}, \]

where $A$ is a 2 x 2 matrix with trace zero and $B$ is an arbitrary 2 x 1
matrix.

(a) Show that $\mathfrak{g}$ is a subalgebra of $M_3(\mathbb{C})$.

(b) Let $X$, $Y$, $H$, $e_1$, and $e_2$ be the following basis for $\mathfrak{g}$. We let $X$, $Y$, and
$H$ be the usual sl(2;C) basis in the “A” slot, with $B = 0$. We let $e_1$ be the
matrix with a 1 in the first slot in $B$ and zeros everywhere else, and we
let $e_2$ be the matrix with a 1 in the second slot of $B$ and zeros everywhere
else. Compute the commutation relations among these basis elements.

(c) Show that $\mathfrak{g}$ has precisely one nontrivial ideal, namely the span of $e_1$
and $e_2$.

Hint: First, determine the subspaces of $\mathfrak{g}$ that are invariant under the
adjoint action of the sl(2;C) algebra spanned by $X$, $Y$, and $H$, and then
determine which of these subspaces are also invariant under the adjoint
action of $e_1$ and $e_2$. In determining the sl(2;C)-invariant subspaces, use
Exercise 15 of Chapter 4.

(d) Show that the one-dimensional subspace of g spanned by the element 
H is a Cartan subalgebra of g and determine the associated roots. 

(e) Is g semisimple? 


7. Let $\mathfrak{g}$, $\mathfrak{k}$, $K$, $\mathfrak{t}$, and $\mathfrak{h}$ be as usual in this chapter. Consider a root $\alpha$ and
the associated subalgebra $\mathfrak{s}^\alpha = \mathrm{span}\{X_\alpha, Y_\alpha, H_\alpha\}$, where $X_\alpha$ and $Y_\alpha$ are
chosen so that $Y_\alpha = X_\alpha^*$ (where $X^*$ is defined by (6.2)). We use, as usual,
an inner product on $\mathfrak{g}$ that is invariant under the adjoint action of $K$.
Suppose that $V$ is any subspace of $\mathfrak{g}$ that is invariant under the adjoint
action of $\mathfrak{s}^\alpha$. Show that the orthogonal complement $V^\perp$ of $V$ is also invariant
under $\mathfrak{s}^\alpha$.


8. Let $\mathfrak{s}^\alpha$ be the complex Lie algebra spanned by $X_\alpha$, $Y_\alpha$, and $H_\alpha$, which
satisfy the usual sl(2;C) commutation relations. Show that the elements

\[ X_\alpha - Y_\alpha, \qquad X_\alpha + Y_\alpha + iH_\alpha, \qquad X_\alpha + Y_\alpha - iH_\alpha \]

are eigenvectors for the operator $\mathrm{ad}_{X_\alpha} - \mathrm{ad}_{Y_\alpha}$. Compute directly the
quantity

\[ \exp\left[\frac{\pi}{2}\left(\mathrm{ad}_{X_\alpha} - \mathrm{ad}_{Y_\alpha}\right)\right](H_\alpha). \]

Compare (6.23) in the proof of Theorem 6.31. (Here $\pi$ is the number
3.14..., not a representation.)


9. Show that the Hilbert-Schmidt inner product on sl(n;C), $\langle X, Y \rangle =
\operatorname{trace}(X^*Y)$, is invariant under the adjoint action of SU(n).

10. Let $\mathfrak{g}$ be a complex semisimple Lie algebra contained in gl(n;C), let $\mathfrak{k}$ be
a compact real form of $\mathfrak{g}$, and let $K$ be the compact subgroup of GL(n;C)
whose Lie algebra is $\mathfrak{k}$. Now, let $\mathfrak{t}$ be a maximal commutative subalgebra
of $\mathfrak{k}$ and let $T$ be the connected Lie subgroup of $K$ whose Lie algebra is $\mathfrak{t}$.

(a) Prove that $T$ is closed in $K$.

Hint: Show that the closure of $T$ is connected and commutative.

(b) Now, let $Z(T)$ be the centralizer of $T$ in $K$ (i.e., the set of all $A$ in
$K$ such that $AtA^{-1} = t$ for all $t \in T$). (The centralizer of $T$ is the largest
subgroup $H$ of $K$ that contains $T$ and such that $T$ is in the center of $H$.)
Show that $Z(T)$ coincides with $Z(\mathfrak{t})$ as defined in Section 6.6.

(c) Let $N(T)$ denote the normalizer of $T$ in $K$ (i.e., the group of all $A$ in
$K$ such that $AtA^{-1} \in T$ for all $t \in T$). (The normalizer of $T$ is the largest
subgroup $H$ of $K$ that contains $T$ and such that $T$ is normal in $H$.) Show
that $N(T)$ coincides with $N(\mathfrak{t})$ as defined in Section 6.6.

Note: The group $T \subset K$ is a “maximal torus” and the customary
definition of the Weyl group (from the compact group point of view) is
$W = N(T)/Z(T)$. See Bröcker and tom Dieck (1985).

11. Continue with the notation of Exercise 10. Suppose that $\mathfrak{g}$ = sl(n;C),
$\mathfrak{k}$ = su(n), and $\mathfrak{t}$ is the diagonal subalgebra of su(n). Show that $T$ is indeed
a torus (i.e., isomorphic to $S^1 \times S^1 \times \cdots \times S^1$). Show that $Z(\mathfrak{t}) = T$.

12. Consider the complex semisimple Lie algebra so(4;C) with compact real
form so(4). Consider the space $\mathfrak{t}$ of matrices of the form


(6.28) 


with $a, b \in \mathbb{R}$. This is a maximal commutative subalgebra of so(4) (assume
that this is so). Thus, the space $\mathfrak{h}$ of such matrices with $a, b \in \mathbb{C}$ is a Cartan
subalgebra of so(4;C).

Now, consider the matrices of the form

\[ \begin{pmatrix} 0 & C \\ -C^{tr} & 0 \end{pmatrix}, \]


where C is one of the following 2 x 2 matrices: 
Show that each of the resulting elements of so(4;C) is a root vector and
show that the corresponding roots are given by $\alpha_1 = i(a + b)$, $\alpha_2 =
-i(a + b)$, $\alpha_3 = i(a - b)$, and $\alpha_4 = -i(a - b)$. Here, we are thinking of
the roots as elements of $\mathfrak{h}^*$ and, for example, $i(a + b)$ means the linear
functional that associates to the matrix (6.28) the number $i(a + b)$. Show
that the roots $i(a + b)$ and $i(a - b)$ form a base for this root system.

Now, consider on so(4;C) the inner product $\langle X, Y \rangle = \operatorname{trace}(X^*Y)$, which
is invariant under the adjoint action of SO(4). Use this inner product
to identify $\mathfrak{h}^*$ with $\mathfrak{h}$ (as in Section 6.9) and compute the elements of $\mathfrak{h}$
that represent $\alpha_1$, $\alpha_2$, $\alpha_3$, and $\alpha_4$ under this identification. Show that the
elements of the base in the previous paragraph are orthogonal with respect
to the given inner product.

13. Consider the complex semisimple Lie algebra so(5;C) with compact real
form so(5). Consider the space $\mathfrak{t}$ of matrices of the form

\[ \begin{pmatrix} 0 & a & & & \\ -a & 0 & & & \\ & & 0 & b & \\ & & -b & 0 & \\ & & & & 0 \end{pmatrix} \tag{6.29} \]

with $a, b \in \mathbb{R}$. This is a maximal commutative subalgebra of so(5) (assume
that this is so). Thus, the space $\mathfrak{h}$ of such matrices with $a, b \in \mathbb{C}$ is a Cartan
subalgebra of so(5;C).

Show that the matrices of the form

\[ \begin{pmatrix} 0 & C & \\ -C^{tr} & 0 & \\ & & 0 \end{pmatrix}, \tag{6.30} \]

where $C$ is one of the matrices in the previous problem, are root vectors,
with roots given by the same formulas as in the previous problem. Show
that matrices of the form


0 
0 
of, 1 (6.31) 
+ 
—1 Fi00 0 00-14% 0 


are also root vectors with roots $\pm ia$ and $\pm ib$, respectively. Show that the
roots $i(a - b)$ and $ib$ form a base for this root system.

Now, as in the previous problem, use the inner product given by $\langle X, Y \rangle =
\operatorname{trace}(X^*Y)$ to identify $\mathfrak{h}^*$ with $\mathfrak{h}$. Show that the roots associated to the
root vectors (6.30) are longer by a factor of $\sqrt{2}$ than the roots associated to
the root vectors in (6.31). Show that the angle between the two elements of
the base in the previous paragraph is 135°.

14. Consider the complex semisimple Lie algebra sp(2;C) $\subset M_4(\mathbb{C})$ and the
compact real form sp(2) = sp(2;C) $\cap$ u(4). Consider the space $\mathfrak{t}$ of matrices
of the form

\[ \begin{pmatrix} ia & 0 & & \\ 0 & ib & & \\ & & -ia & 0 \\ & & 0 & -ib \end{pmatrix} \]

($a, b \in \mathbb{R}$). This is a maximal commutative subalgebra of sp(2) (assume
that this is so). Thus, the space $\mathfrak{h}$ of such matrices with $a, b \in \mathbb{C}$ is a
Cartan subalgebra of sp(2;C).

Show that the following matrices are root vectors for $\mathfrak{h}$ with roots $i(a + b)$,
$-i(a + b)$, $i(a - b)$, and $-i(a - b)$, respectively:


01 00 
10 00 
00 ; 01 : 
00 10 
01 00 
00 10 
oo]: 0-1 (6.32) 
—10 0 0 


Show that the following matrices are root vectors for h with roots 2ia, 
—2ia, 2ib, and —2ib, respectively: 


10 00 
00 00 
00 10 i 
00 00 
00 00 
01 00 
00 ; 00 (6.33) 
00 01 


Note that the roots for sp(2;C) are given by the same formulas as for
so(5;C), except that for sp(2;C), we have $\pm 2ia$ and $\pm 2ib$, whereas for
so(5;C), we have $\pm ia$ and $\pm ib$. Show that the roots $i(a - b)$ and $2ib$ form
a base for this root system.

Now, as in the previous two problems, use the inner product $\langle X, Y \rangle =
\operatorname{trace}(X^*Y)$ to identify $\mathfrak{h}^*$ with $\mathfrak{h}$. Show that the roots in (6.32) are shorter
by a factor of $\sqrt{2}$ than the roots in (6.33). Show that the angle between the two
elements of the base in the previous paragraph is 135°.

15. Show that the subalgebra $\mathfrak{t}$ of su(n) given in Section 6.9 is maximal
commutative.

Hint: If $X$ is any matrix in su(n) that commutes with every $H$ in $\mathfrak{t}$, write
$X$ as an element of $\mathfrak{t}$ plus a linear combination of the $E_{kl}$'s with $k \neq l$.

16. Suppose that $\mathfrak{k}$ is a compact real form of the complex semisimple Lie
algebra sl(n;C). Let $K$ be the compact subgroup (Proposition 6.8) of
SL(n;C) whose Lie algebra is $\mathfrak{k}$.

(a) Show that there exists an inner product on $\mathbb{C}^n$ that is invariant under
the action of $K$.

(b) Show that $\mathfrak{k}$ consists precisely of those matrices that are skew
self-adjoint with respect to this inner product and have trace zero.

(c) Show that there exists an element $A$ of SL(n;C) such that $\mathrm{Ad}_A(\mathfrak{k})$ =
su(n).

This establishes the uniqueness result in Theorem 6.38 for the case $\mathfrak{g}$ =
sl(n;C).

17. (a) Suppose that $X \in$ sl(n;C) is diagonalizable. Show that $\mathrm{ad}_X$ :
sl(n;C) $\to$ sl(n;C) is diagonalizable.

(b) Suppose that $N \in$ sl(n;C) is nilpotent. Show that $\mathrm{ad}_N$ : sl(n;C) $\to$
sl(n;C) is nilpotent.

(c) Suppose that $X \in$ sl(n;C) is such that $\mathrm{ad}_X$ : sl(n;C) $\to$ sl(n;C) is
diagonalizable. Show that $X$ is diagonalizable.

Hint: What is the SN decomposition of $X$?

18. Suppose that $\mathfrak{h}$ is an arbitrary Cartan subalgebra of sl(n;C).

(a) Show that the elements of $\mathfrak{h}$ are simultaneously diagonalizable.

(b) Show that there exists $g \in$ SL(n;C) such that $g\mathfrak{h}g^{-1} = \mathfrak{h}_0$, where $\mathfrak{h}_0$
denotes the diagonal subalgebra of sl(n;C).

(c) Show that there exists a compact real form $\mathfrak{k}$ of sl(n;C) and a maximal
commutative subalgebra $\mathfrak{t}$ of $\mathfrak{k}$ such that $\mathfrak{h} = \mathfrak{t} + i\mathfrak{t}$.

Use Exercise 17. This establishes the uniqueness result in Theorem 6.40
for the case $\mathfrak{g}$ = sl(n;C).

19. Let $\mathfrak{t}$ be an arbitrary maximal commutative subalgebra of su(n).

(a) Show that the elements of $\mathfrak{t}$ are simultaneously diagonalizable.

(b) Show that there exists an element $A$ of SU(n) such that $\mathrm{Ad}_A(\mathfrak{t})$ is the
space of diagonal matrices in su(n).

This (together with Exercise 16) establishes the uniqueness result in
Theorem 6.39 for the case $\mathfrak{g}$ = sl(n;C).


7 


Representations of Complex Semisimple Lie 
Algebras 


In this chapter, we will study the finite-dimensional irreducible representations 
of a complex semisimple Lie algebra g. These will be classified by means of 
a “theorem of the highest weight.” The theorem states that every irreducible 
representation has a (unique) highest weight, that two irreducible representa- 
tions with the same highest weight are equivalent, and that the elements that 
actually arise as highest weights of irreducible representations are precisely 
the “dominant integral” elements. 

Now that we have developed (in the previous chapter) the relevant struc- 
tures for semisimple Lie algebras, most of the proof of the theorem of the 
highest weight goes precisely as in the case of sl(3;C). Nevertheless, there is 
one part of the proof that cannot be done the way we did the sl(3;C) case, 
namely showing that every dominant integral element actually arises as the 
highest weight of some irreducible representation. For this step, we need some 
method of constructing representations, in contrast to the rest of the proof, 
in which we assume that we are given a representation and we start analyzing 
it. 

In the sl(3;C) case, we constructed the representations by starting with the
standard representation and the dual of the standard representation and then
taking subspaces of tensor products of these two representations. Although
similar methods can be used (as in Fulton and Harris (1991)) for other classical
groups, this method does not work in general.

For a general semisimple Lie algebra, there are three standard methods of 
constructing the representations. The first is a purely Lie-algebraic approach 
using Verma modules. The second method constructs the representations as 
representations of the associated simply-connected compact group and uses 
the Peter-Weyl theorem and the Weyl character formula. The third method 
constructs the representations as representations of the complex Lie group G 
whose Lie algebra is g. In this approach, G acts on the space of holomor- 
phic functions that transform in a certain way under the action of a certain 
subgroup B of G. This approach is called Borel—Weil theory. 

We will give an essentially complete treatment of the Verma module
approach. For the other two approaches, I provide a detailed outline with
references for further reading.

In Chapter 8 we will work out examples for semisimple Lie algebras of 
rank 2 and rank 3. 

We continue with the setting of the previous chapter. We consider a
complex semisimple Lie algebra $\mathfrak{g} \subset$ gl(n;C). We choose, once and for all, a
compact real form $\mathfrak{k}$ of $\mathfrak{g}$ and a maximal commutative subalgebra $\mathfrak{t}$ of $\mathfrak{k}$, and
we consider the Cartan subalgebra $\mathfrak{h} = \mathfrak{t} + i\mathfrak{t}$ of $\mathfrak{g}$. We also choose an inner
product on $\mathfrak{g}$ that is invariant under the adjoint action of $K \subset$ GL(n;C) and
that takes real values on $\mathfrak{k}$.

We let $R \subset i\mathfrak{t} \subset \mathfrak{h}$ be the set of roots in the sense of Notation 6.24. We
choose, once and for all, a base $\Delta$ for $R$ (in the sense of Definition 6.35),
the elements of which are called the positive simple roots. Every root is then
either positive or negative (with respect to $\Delta$) in the sense of Definition 6.35.
We let $W$ denote the Weyl group, which may be thought of (Theorem 6.33)
as the group of linear transformations of $\mathfrak{h}$ generated by the reflections $w_\alpha$,
$\alpha \in R$. The set of roots has all the properties of a “root system,” listed in
Theorem 6.34.

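We consider also the co-roots. For each $\alpha$, there exist (Theorem 6.20)
$X_\alpha \in \mathfrak{g}_\alpha$, $Y_\alpha \in \mathfrak{g}_{-\alpha}$, and $H_\alpha \in \mathfrak{h}$ such that $[H_\alpha, X_\alpha] = 2X_\alpha$, $[H_\alpha, Y_\alpha] = -2Y_\alpha$,
and $[X_\alpha, Y_\alpha] = H_\alpha$. The element $H_\alpha$ is unique (independent of the choice of
$X_\alpha$ and $Y_\alpha$) and is called the co-root associated to the root $\alpha$. According to
Section 6.5, the roots and co-roots are related by the formulas

\[ H_\alpha = \frac{2\alpha}{\langle \alpha, \alpha \rangle} \tag{7.1} \]

and

\[ \alpha = \frac{2H_\alpha}{\langle H_\alpha, H_\alpha \rangle}. \tag{7.2} \]

In particular, $\langle \alpha, H_\alpha \rangle = 2$. The set of co-roots also constitutes a root system,
and the set of $H_\alpha$, $\alpha \in \Delta$, forms a base for the system of co-roots.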


7.1 Integral and Dominant Integral Elements 


Definition 7.1. An element $\mu$ of $\mathfrak{h}$ is called an integral element if $\langle \mu, H_\alpha \rangle$
is an integer for each root $\alpha$.


As explained in the next section, the integral elements are precisely the
elements of $\mathfrak{h}$ that arise as weights of finite-dimensional representations of $\mathfrak{g}$.


Proposition 7.2. The set of integral elements is invariant under the action 
of the Weyl group. 

Proof. Suppose that $\mu \in \mathfrak{h}$ is an integral element and that $w$ is an element
of the Weyl group. Then, since the inner product on $\mathfrak{h}$ is invariant under the
action of the Weyl group, we have for any root $\alpha$, $\langle w \cdot \mu, H_\alpha \rangle = \langle \mu, w^{-1} \cdot H_\alpha \rangle$.
Since the set of co-roots is invariant under the Weyl group, $w^{-1} \cdot H_\alpha$ is another
co-root (namely $H_{w^{-1} \cdot \alpha}$) and, so, $\langle \mu, w^{-1} \cdot H_\alpha \rangle$ is an integer. This shows that
$\langle w \cdot \mu, H_\alpha \rangle$ is an integer and thus that $w \cdot \mu$ is an integral element. □


Checking that $\langle \mu, H_\alpha \rangle$ is an integer for every root $\alpha$ is a rather tiresome
process. Fortunately, it suffices to check just for the positive simple roots.


Theorem 7.3. If $\mu$ is an element of $\mathfrak{h}$ for which $\langle \mu, H_\alpha \rangle$ is an integer for all
positive simple roots $\alpha$, then $\langle \mu, H_\alpha \rangle$ is an integer for all roots $\alpha$ and, thus, $\mu$
is an integral element.


Proof. Suppose $\alpha_1, \ldots, \alpha_r$ are the positive simple roots. According to
Proposition 6.37 (which is proved in Chapter 8), $H_{\alpha_1}, \ldots, H_{\alpha_r}$ form a base for the
system of co-roots. This means that for any root $\alpha$, the co-root $H_\alpha$ can be
expressed as a linear combination of $H_{\alpha_1}, \ldots, H_{\alpha_r}$ with integer coefficients.
Thus, if $\langle \mu, H_{\alpha_j} \rangle$ is an integer for each $j = 1, \ldots, r$, then $\langle \mu, H_\alpha \rangle$ is an integer
for all roots $\alpha$. □


Recalling the expression (7.1) for $H_\alpha$ in terms of $\alpha$, we may restate
Theorem 7.3 as follows.


Theorem 7.4. An element $\mu$ of $\mathfrak{h}$ is integral if and only if

\[ 2\frac{\langle \mu, \alpha \rangle}{\langle \alpha, \alpha \rangle} \]

is an integer for each positive simple root $\alpha$.


Corollary 7.5. Every root is an integral element.


Recall now from elementary linear algebra that if $\mu$ and $\alpha$ are any two
elements of an inner-product space, then the orthogonal projection of $\mu$ onto
$\alpha$ is given by

\[ \frac{\langle \alpha, \mu \rangle}{\langle \alpha, \alpha \rangle}\,\alpha. \]

Thus, we may reformulate the notion of an integral element yet again as
follows.


Theorem 7.6. An element $\mu$ of $\mathfrak{h}$ is integral if and only if the orthogonal
projection of $\mu$ onto each positive simple root $\alpha$ is an integer or half-integer
multiple of $\alpha$.


This characterization of the integral elements will help us visualize
graphically what the set of integral elements looks like in examples. (See Sections
8.5 and 8.6.) All of these reformulations of the notion of an integral element
should not cause us to lose sight of the “real” definition of integrality, which
is that $\langle \mu, H_\alpha \rangle$ be an integer for each positive simple root $\alpha$ and, therefore, by
Proposition 6.37 for every root $\alpha$. It is this form of integrality that explains
why the weights of a finite-dimensional representation of $\mathfrak{g}$ must be integral,
as we will see in the next section.
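
To make the integrality criterion concrete, here is a small check in the sl(3;C) case. It is a sketch under assumptions: elements of $\mathfrak{h}$ are modeled as real trace-zero triples with the inner product of Section 6.9, exact rational arithmetic is used, and the two sample elements `mu` and `nu` as well as the function names are chosen here for illustration.

```python
from fractions import Fraction

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def coroot_pairing(mu, alpha):
    """The number 2<mu,alpha>/<alpha,alpha>, which equals <mu, H_alpha>."""
    return Fraction(2 * dot(mu, alpha), dot(alpha, alpha))

def is_integral(mu, roots):
    """True if <mu, H_alpha> is an integer for every alpha in roots."""
    return all(coroot_pairing(mu, a).denominator == 1 for a in roots)

def projection_coefficient(mu, alpha):
    """Coefficient c in the orthogonal projection c * alpha of mu onto alpha."""
    return coroot_pairing(mu, alpha) / 2

simple_roots = [(1, -1, 0), (0, 1, -1)]
all_roots = [(1, -1, 0), (0, 1, -1), (1, 0, -1),
             (-1, 1, 0), (0, -1, 1), (-1, 0, 1)]

mu = (Fraction(2, 3), Fraction(-1, 3), Fraction(-1, 3))  # sample element
nu = (Fraction(1, 2), 0, Fraction(-1, 2))                # sample element

print(is_integral(mu, simple_roots), is_integral(mu, all_roots))
print(is_integral(nu, simple_roots))
print([projection_coefficient(mu, a) for a in simple_roots])
```

The sample `mu` passes the test on the simple roots and on all six roots, as Theorem 7.3 predicts, even though its entries are not integers. The sample `nu` already fails on the first simple root.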

We now turn to the elements that will arise as the highest weights of 
finite-dimensional irreducible representations of g. 


Definition 7.7. An element $\mu$ of $\mathfrak{h}$ is called a dominant integral element if
$\langle \mu, H_\alpha \rangle$ is a non-negative integer for each positive simple root $\alpha$. Equivalently,
$\mu$ is a dominant integral element if

\[ 2\frac{\langle \mu, \alpha \rangle}{\langle \alpha, \alpha \rangle} \]

is a non-negative integer for each positive simple root $\alpha$.


If $\mu$ is dominant integral, then $\langle \mu, H_\alpha \rangle$ will automatically be a non-negative
integer for each positive root $\alpha$, not just the positive simple ones.


Definition 7.8. The set of $\mu \in i\mathfrak{t} \subset \mathfrak{h}$ such that $\langle \mu, \alpha \rangle \geq 0$ for all positive
simple roots $\alpha$ is called the closed fundamental Weyl chamber relative to
the given set of positive simple roots.


The dominant integral elements are precisely those integral elements
contained in the closed fundamental Weyl chamber. In the case of sl(3;C), the
fundamental Weyl chamber is a 60° sector; see Figure 5.2.

We have observed that every root is an integral element. It follows that any 
linear combination of roots with integer coefficients is also an integral element. 
We can ask whether the reverse holds: Is every integral element expressible as 
a linear combination of roots with integer coefficients? The answer in general 
is no. This matter is discussed further in Section 8.10. 


7.2 The Theorem of the Highest Weight 


We continue with the notation established at the beginning of this chapter. 
We begin with elementary properties of the representations of g. 


Definition 7.9. Suppose $\pi$ is a finite-dimensional representation of $\mathfrak{g}$ on a
vector space $V$. Then, $\mu \in \mathfrak{h}$ is called a weight for $\pi$ if there exists a nonzero
vector $v$ in $V$ such that

\[ \pi(H)v = \langle \mu, H \rangle v \tag{7.3} \]

for all $H \in \mathfrak{h}$. A nonzero vector $v$ satisfying (7.3) is called a weight vector
for the weight $\mu$, and the set of all vectors satisfying (7.3) (zero or nonzero)
is called the weight space with weight $\mu$. The dimension of the weight space
is called the multiplicity of the weight.

To understand this definition, suppose that $v \in V$ is a simultaneous
eigenvector for each $\pi(H)$, $H \in \mathfrak{h}$. This means that for each $H \in \mathfrak{h}$, there is a
number $\lambda_H$ such that $\pi(H)v = \lambda_H v$. Since the representation $\pi(H)$ is linear
in $H$, the $\lambda_H$'s must depend linearly on $H$ as well; that is, the map $H \mapsto \lambda_H$
is a linear functional on $\mathfrak{h}$. Then (Section B.7), there is a unique element $\mu$ of
$\mathfrak{h}$ such that $\lambda_H = \langle \mu, H \rangle$. Thus, a weight vector is nothing but a simultaneous
eigenvector for all the $\pi(H)$'s and the vector $\mu$ is simply a convenient way of
encoding the eigenvalues. Note that the roots (in the sense of Notation 6.24)
are precisely the nonzero weights of the adjoint representation of $\mathfrak{g}$.

It is easily shown that two equivalent representations have the same 
weights and multiplicities. 


Proposition 7.10. If $\mu \in \mathfrak{h}$ is a weight of some finite-dimensional
representation $(\pi, V)$ of $\mathfrak{g}$, then $\mu$ is an integral element in the sense of Definition
7.1.


Proof. Each co-root $H_\alpha$ is part of an sl(2;C)-subalgebra $\{X_\alpha, Y_\alpha, H_\alpha\}$
(Theorem 6.20). The restriction of $\pi$ to this subalgebra is a finite-dimensional
representation of sl(2;C), and in any such representation, the eigenvalues of
$H_\alpha$ must be integers (Theorem 4.12). If $\mu$ is a weight of $\pi$, then, by (7.3),
$\langle \mu, H_\alpha \rangle$ is an eigenvalue for $H_\alpha$ in $\pi$, and, so, $\langle \mu, H_\alpha \rangle$ must be an integer. □


It is true, although by no means obvious, that every integral element ac- 
tually arises as a weight of some finite-dimensional representation of g. See 
the discussion following Theorem 7.15. 

We now observe, as in the sl(3;C) case, that the root vectors shift the 
weights by the corresponding root. 


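Proposition 7.11. Suppose that $v$ is a weight vector with weight $\mu$ and
suppose that $X_\alpha$ is an element of the root space $\mathfrak{g}_\alpha$. Then, for all $H$ in $\mathfrak{h}$, we
have

\[ \pi(H)\pi(X_\alpha)v = (\langle \mu, H \rangle + \langle \alpha, H \rangle)\pi(X_\alpha)v; \]

that is, either $\pi(X_\alpha)v$ is zero or $\pi(X_\alpha)v$ is a weight vector with weight $\mu + \alpha$.


Proof. This is proved in the same way as for the case of sl(3;C). Since
$[H, X_\alpha] = \langle \alpha, H \rangle X_\alpha$, we have

\begin{align*}
\pi(H)\pi(X_\alpha)v &= [\pi(X_\alpha)\pi(H) + \pi([H, X_\alpha])]v \\
&= [\pi(X_\alpha)\langle \mu, H \rangle + \langle \alpha, H \rangle \pi(X_\alpha)]v \\
&= [\langle \mu, H \rangle + \langle \alpha, H \rangle]\pi(X_\alpha)v.
\end{align*}

In the first equality, we have used that $[\pi(H), \pi(X_\alpha)] = \pi([H, X_\alpha])$. □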
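
The shifting of weights can be watched in the simplest example. The sketch below assumes the standard representation of sl(3;C), in which $\pi(X) = X$ acts on $\mathbb{C}^3$, so that the basis vector $e_l$ is a weight vector with eigenvalue $\lambda_l$ for $H = \mathrm{diag}(\lambda_1, \lambda_2, \lambda_3)$, and the root vector $E_{kl}$ has eigenvalue $\lambda_k - \lambda_l$ under $\mathrm{ad}_H$. Eigenvalues are compared directly instead of going through the inner product; the helper names, the sample entries, and the 0-based indices are illustrative choices.

```python
def matvec(A, v):
    return [sum(A[i][j] * v[j] for j in range(len(v))) for i in range(len(A))]

def unit_matrix(n, k, l):
    """E_kl: a one in row k, column l (0-indexed), zeros elsewhere."""
    return [[1 if (i, j) == (k, l) else 0 for j in range(n)] for i in range(n)]

def diag(entries):
    n = len(entries)
    return [[entries[i] if i == j else 0 for j in range(n)] for i in range(n)]

def basis_vector(n, l):
    return [1 if i == l else 0 for i in range(n)]

def shifts_weight(lam):
    """Check pi(H) pi(E_kl) e_l = (lam_l + (lam_k - lam_l)) pi(E_kl) e_l."""
    n = len(lam)
    H = diag(lam)
    for l in range(n):
        v = basis_vector(n, l)              # weight vector, eigenvalue lam[l]
        for k in range(n):
            if k == l:
                continue
            w = matvec(unit_matrix(n, k, l), v)   # pi(X_alpha) v
            expected = lam[l] + (lam[k] - lam[l])  # weight plus root
            if matvec(H, w) != [expected * x for x in w]:
                return False
    return True

lam = [2, -5, 3]  # sample diagonal entries with trace zero
print(shifts_weight(lam))
```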


Proposition 7.12. Every finite-dimensional representation $(\pi, V)$ is the
direct sum of its weight spaces; that is, the operators of the form $\pi(H)$,
$H \in \mathfrak{h}$, are simultaneously diagonalizable in every finite-dimensional
representation.

Proof. By complete reducibility, the representation decomposes as a direct
sum of irreducible representations, and so it suffices to prove the result in the
irreducible case. Thus, we assume now that $\pi$ is irreducible. Let $U$ be the span
of all the weight spaces in $V$; that is, $U$ is the space of all vectors $u \in V$ such
that $u$ can be written as a linear combination of weight vectors. In light of
Proposition B.14, $U$ is actually the direct sum of all the weight spaces in $V$.
Proposition B.10 tells us that any commuting family of operators on a
finite-dimensional complex vector space has at least one simultaneous eigenvector.
Applying this to the operators $\pi(H)$, $H \in \mathfrak{h}$, we see that $V$ has at least one
weight vector, which means that $U \neq \{0\}$.

I claim now that $U$ is invariant under the action of $\mathfrak{g}$. Clearly, $U$ is invariant
under $\pi(H)$, $H \in \mathfrak{h}$, and by Proposition 7.11, $U$ is invariant under each of
the root spaces $\mathfrak{g}_\alpha$. Since $\mathfrak{g}$ is the direct sum of $\mathfrak{h}$ and the root spaces, $U$
is invariant under $\mathfrak{g}$. Then, since we are assuming $V$ is irreducible and since
$U \neq \{0\}$, we must have $U = V$. □


Proposition 7.13. For any finite-dimensional representation $\pi$ of
$\mathfrak{g}$, the weights of $\pi$ and their multiplicities are invariant under
the action of the Weyl group.


Proof. Recall that we are thinking of the complex semisimple Lie algebra
$\mathfrak{g}$ as sitting inside some gl(n;C), that we have chosen a compact real
form $\mathfrak{k}$ of $\mathfrak{g}$, and that $K$ denotes the connected Lie
subgroup of GL(n;C) whose Lie algebra is $\mathfrak{k}$. Saying that
$\mathfrak{k}$ is a compact real form of $\mathfrak{g}$ means that there exists a
simply-connected compact matrix Lie group $K_1$ whose Lie algebra
$\mathfrak{k}_1$ is isomorphic to $\mathfrak{k}$. Since $\mathfrak{k}_1$ and
$\mathfrak{k}$ are isomorphic, we can think of $\mathfrak{k}$ as being the Lie
algebra of $K$ or as being the Lie algebra of $K_1$. However, the groups $K$ and
$K_1$ need not be isomorphic. (For example, we may have $K = SO(3)$ and
$K_1 = SU(2)$.) On the surface of things, it appears that the notion of the Weyl
group might depend on whether we think of $\mathfrak{k}$ as the Lie algebra of
$K$ or of $K_1$. However, Theorem 6.33 tells us that we do get the same Weyl
group either way, namely the group generated by the reflections $w_\alpha$. With
this in mind, we choose to think of $\mathfrak{k}$ as the Lie algebra of the
simply-connected group $K_1$. Then, there is a representation $\Pi$ of $K_1$ such
that $\Pi(\exp X) = \exp \pi(X)$ for all $X$ in $\mathfrak{k}$ (which we identify
with $\mathfrak{k}_1$).

Now, let $w$ be an element of $W$ and let $A$ be an element of $N(\mathfrak{t})$
that represents it. If $v$ is a weight vector with weight $\mu$, consider
$\Pi(A)v$. We have

$$\begin{aligned}
\pi(H)\Pi(A)v &= \Pi(A)\Pi(A)^{-1}\pi(H)\Pi(A)v \\
&= \Pi(A)\,\pi(\mathrm{Ad}_{A^{-1}}(H))v \\
&= \langle\mu, \mathrm{Ad}_{A^{-1}}(H)\rangle\,\Pi(A)v \\
&= \langle\mathrm{Ad}_{A}(\mu), H\rangle\,\Pi(A)v.
\end{aligned}$$

In the second equality we have used Point 1 of Theorem 2.21 and in the last
equality we have used the invariance of the inner product under the adjoint
action of $K$. This calculation shows that $\Pi(A)v$ is a weight vector with
weight $\mathrm{Ad}_A(\mu) = w \cdot \mu$. The same line of reasoning shows that
$\Pi(A)$ is an isomorphism between the weight space with weight $\mu$ and the
weight space with weight $w \cdot \mu$, and so $w \cdot \mu$ is, again, a weight
for $\pi$ with the same multiplicity as $\mu$. □


Definition 7.14. Let $\mu_1$ and $\mu_2$ be two elements of $\mathfrak{h}$. Then,
$\mu_1$ is higher than $\mu_2$ (or, equivalently, $\mu_2$ is lower than $\mu_1$)
if there exist non-negative real numbers $a_1, \ldots, a_r$ such that

$$\mu_1 - \mu_2 = a_1\alpha_1 + a_2\alpha_2 + \cdots + a_r\alpha_r,$$

where $\{\alpha_1, \ldots, \alpha_r\} = \Delta$ is the set of positive simple
roots. This relationship is written as $\mu_1 \succeq \mu_2$ or
$\mu_2 \preceq \mu_1$.

If $\pi$ is a representation of $\mathfrak{g}$, then a weight $\mu_0$ for $\pi$
is said to be a highest weight if for all weights $\mu$ of $\pi$,
$\mu \preceq \mu_0$.
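
The ordering in Definition 7.14 can be tested mechanically once coordinates are chosen. The following is a minimal sketch, not part of the text's development: it assumes weights are given by their coordinates in the fundamental-weight basis (the numbers $\langle\mu, H_{\alpha_j}\rangle$), and it uses the rank-two example sl(3;C), where the simple roots then have coordinates (2, -1) and (-1, 2). The function names are hypothetical.

```python
from fractions import Fraction

def simple_root_coefficients(diff, simple_roots):
    """Solve diff = a_1*alpha_1 + ... + a_r*alpha_r by Gauss-Jordan elimination.

    Assumption: the simple roots are r linearly independent vectors of length r.
    """
    r = len(simple_roots)
    # Row i is the equation for coordinate i; the last column is the right-hand side.
    rows = [[Fraction(simple_roots[j][i]) for j in range(r)] + [Fraction(diff[i])]
            for i in range(r)]
    for col in range(r):
        pivot = next(i for i in range(col, r) if rows[i][col] != 0)
        rows[col], rows[pivot] = rows[pivot], rows[col]
        rows[col] = [x / rows[col][col] for x in rows[col]]
        for i in range(r):
            if i != col and rows[i][col] != 0:
                factor = rows[i][col]
                rows[i] = [x - factor * y for x, y in zip(rows[i], rows[col])]
    return [rows[i][r] for i in range(r)]

def is_higher(mu1, mu2, simple_roots):
    """True if mu1 - mu2 is a non-negative real combination of the simple roots."""
    diff = [a - b for a, b in zip(mu1, mu2)]
    return all(a >= 0 for a in simple_root_coefficients(diff, simple_roots))

# Simple roots of sl(3;C) in fundamental-weight coordinates (an assumed convention).
A2_SIMPLE_ROOTS = [(2, -1), (-1, 2)]

print(simple_root_coefficients((1, 0), A2_SIMPLE_ROOTS))
print(is_higher((1, 0), (0, 0), A2_SIMPLE_ROOTS))
print(is_higher((1, 0), (0, 1), A2_SIMPLE_ROOTS))
```

The first weight, (1, 0), is higher than 0 only with fractional coefficients, which is why the definition allows non-negative real numbers rather than integers. The pair (1, 0) and (0, 1) is not comparable in either direction, so the relation is only a partial ordering.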


Theorem 7.15 (Theorem of the Highest Weight). 


1. Every irreducible representation has a highest weight. 

2. Two irreducible representations with the same highest weight are equiva- 
lent. 

3. The highest weight of every irreducible representation is a dominant inte- 
gral element. 

4. Every dominant integral element occurs as the highest weight of an irre- 
ducible representation. 


The proof of the first three points of the theorem is almost precisely as 
in the case of sl(3;C). The proof of Point 4 is substantially more complicated 
than in the sl(3;C) case and is discussed at length in the following sections. 

It follows from Theorem 7.15 and properties of the Weyl group that every
integral element occurs as a weight of some finite-dimensional irreducible
representation of $\mathfrak{g}$. Specifically, if $\mu$ is an integral element,
then (Section 8.7) there exists $w \in W$ such that $\mu_0 := w \cdot \mu$ is a
dominant integral element. Suppose $V$ is the irreducible representation with
highest weight $\mu_0$. Then by Proposition 7.13, $\mu = w^{-1} \cdot \mu_0$ will
be a weight of $V$.
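
The existence of $w$ can be made concrete. The sketch below is an illustration under stated assumptions rather than the argument of Section 8.7: weights are written in fundamental-weight coordinates, the simple root $\alpha_i$ has coordinates given by row `i` of a Cartan matrix, and a simple reflection is applied whenever a coordinate is negative. The names `reflect`, `to_dominant`, and `A2_CARTAN` are hypothetical, and the iteration cap is only a safety guard.

```python
def reflect(mu, i, cartan):
    """Apply the simple reflection w_{alpha_i} to mu.

    In fundamental-weight coordinates, w_i(mu) = mu - mu[i] * alpha_i,
    where alpha_i has coordinates cartan[i] (an assumed convention).
    """
    return tuple(m - mu[i] * c for m, c in zip(mu, cartan[i]))

def to_dominant(mu, cartan, max_steps=1000):
    """Return (mu0, word): mu0 is dominant and is obtained from mu by applying
    the simple reflections listed in word, in order."""
    mu = tuple(mu)
    word = []
    for _ in range(max_steps):
        negatives = [i for i, m in enumerate(mu) if m < 0]
        if not negatives:
            return mu, word
        i = negatives[0]
        mu = reflect(mu, i, cartan)
        word.append(i)
    raise RuntimeError("did not reach the dominant chamber")

# Cartan matrix of sl(3;C); row i holds the coordinates of alpha_i.
A2_CARTAN = [(2, -1), (-1, 2)]

mu0, word = to_dominant((-1, -1), A2_CARTAN)
print(mu0, word)

# Each reflection is an involution, so undoing the word in reverse recovers mu.
back = mu0
for i in reversed(word):
    back = reflect(back, i, A2_CARTAN)
print(back)
```

In this example the integral element (-1, -1) is carried to the dominant element (1, 1), so by the paragraph above it occurs as a weight of the irreducible representation with highest weight (1, 1).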


Definition 7.16. A representation $(\pi, V)$ of $\mathfrak{g}$ is said to be a
highest weight cyclic representation with weight $\mu_0$ if there exists
$v \neq 0$ in $V$ such that


1. $v$ is a weight vector with weight $\mu_0$;
2. $\pi(X_\alpha)v = 0$ for all positive roots $\alpha$;
3. the smallest invariant subspace of $V$ containing $v$ is all of $V$.


The vector $v$ is called a cyclic vector for $\pi$.


Proposition 7.17. Let $(\pi, V)$ be a highest weight cyclic representation of
$\mathfrak{g}$ with weight $\mu_0$. Then:


1. $\pi$ has highest weight $\mu_0$.
2. The weight space corresponding to the highest weight $\mu_0$ is one
   dimensional.


Proof. Let $v$ be as in the definition. Let $\alpha_1, \ldots, \alpha_r$ be the
positive simple roots and let $\alpha_{r+1}, \ldots, \alpha_m$ be the remaining
positive roots (in any order). For each $l$, choose nonzero elements $X_l$ in the
root space $\mathfrak{g}_{\alpha_l}$ and $Y_l$ in the root space
$\mathfrak{g}_{-\alpha_l}$. Consider, then, the subspace $U$ of $V$ spanned by
elements of the form

$$u = \pi(Y_{i_1})\pi(Y_{i_2})\cdots\pi(Y_{i_N})v. \tag{7.4}$$

We want to show that $U$ is invariant under the action of $\mathfrak{g}$. To show
this, it suffices to show that $U$ is invariant under each $\pi(H)$,
$H \in \mathfrak{h}$, and that for each positive root $\alpha_l$, $U$ is
invariant under $\pi(X_l)$ and $\pi(Y_l)$. We consider the basis for
$\mathfrak{g}$ (as a vector space) consisting of
$H_{\alpha_1}, \ldots, H_{\alpha_r}$ (where $\alpha_1, \ldots, \alpha_r$ are the
positive simple roots) together with the elements $X_1, \ldots, X_m$ and
$Y_1, \ldots, Y_m$.

Now, we apply to an element $u$ of the form (7.4) an operator of the form
$\pi(H_{\alpha_l})$, $\pi(X_l)$, or $\pi(Y_l)$. We apply Lemma 5.14 to rewrite
the resulting vector as a linear combination of terms, each of which has all of
the factors of $\pi(X_l)$ to the right (acting first on $v$), followed by the
factors of $\pi(H_{\alpha_l})$, followed by the factors of $\pi(Y_l)$. As in the
sl(3;C) case, any term that actually has any factors of $\pi(X_l)$ acting on $v$
will be zero. In the remaining terms, each factor of $\pi(H_{\alpha_l})$ will hit
$v$ first and will give back just a constant times $v$. Thus, only the factors of
$\pi(Y_l)$ will remain and we obtain a linear combination of vectors of the form
(7.4). Thus, the vector that we obtain by applying $\pi(H_{\alpha_l})$,
$\pi(X_l)$, or $\pi(Y_l)$ to $u$ is, again, in the space $U$.

The space $U$ is invariant, and, by definition, it contains the vector $v$
(taking $N = 0$ in (7.4)). So, by the definition of a highest weight cyclic
representation, $U$ must be all of $V$. Thus, every element of $V$ is a linear
combination of vectors of the form (7.4). However, by Proposition 7.11, each
vector of the form (7.4) is either zero or a weight vector with weight
$\mu_0 - \alpha_{i_1} - \cdots - \alpha_{i_N}$, which is lower than or equal to
$\mu_0$. So, $\mu_0$ is the highest weight for $V$.

Furthermore, every element of $V$ is a linear combination of $v$ itself (the
terms with $N = 0$) and weight vectors with weight strictly lower than $\mu_0$
(the terms with $N > 0$). It then follows from Proposition B.14 that the only
weight vectors with weight $\mu_0$ are multiples of $v$, and so the weight space
with weight $\mu_0$ is one dimensional. □


Proposition 7.18. Every irreducible representation of $\mathfrak{g}$ is a
highest weight cyclic representation, with a unique highest weight $\mu_0$.


Proof. Uniqueness is immediate, since by the previous proposition, $\mu_0$ is the
highest weight, and two distinct weights cannot both be highest.

We have already shown that every irreducible representation is the direct
sum of its weight spaces. Since the representation is finite dimensional, there
can be only finitely many weights. It follows that there must be a maximal
weight (i.e., a weight $\mu_0$ such that there is no weight $\mu \neq \mu_0$ with
$\mu \succeq \mu_0$). That being the case, if $v$ is a nonzero weight vector with
weight $\mu_0$, we must have

$$\pi(X_\alpha)v = 0$$

for each element $X_\alpha$ of a root space $\mathfrak{g}_\alpha$ corresponding
to a positive root $\alpha$. (If not, then $\pi(X_\alpha)v$ would be a weight
vector with weight $\mu_0 + \alpha \succeq \mu_0$, contradicting maximality.)

Since $\pi$ is assumed irreducible, the smallest invariant subspace containing
$v$ must be the whole space; therefore, the representation is highest weight
cyclic. □


Proposition 7.19. Every highest weight cyclic representation of $\mathfrak{g}$
is irreducible.


Proof. Let $(\pi, V)$ be a highest weight cyclic representation with highest
weight $\mu_0$ and cyclic vector $v$. By complete reducibility, $V$ decomposes as
a direct sum of irreducible representations

$$V \cong \bigoplus_i V_i.$$

By Proposition 7.12, each of the $V_i$'s is the direct sum of its weight spaces.
Thus, since the weight $\mu_0$ occurs in $V$, it must occur in some $V_i$. On the
other hand, Proposition 7.17 says that the weight space corresponding to $\mu_0$
is one dimensional; that is, $v$ is (up to a constant) the only vector in $V$
with weight $\mu_0$. Thus, $V_i$ must contain $v$. However, then, $V_i$ is an
invariant subspace containing $v$, so $V_i = V$. Thus, there is only one term in
the sum (5.9), and $V$ is irreducible. □


Proposition 7.20. Two irreducible representations of $\mathfrak{g}$ with the
same highest weight are equivalent.


Proof. We now know that a representation is irreducible if and only if it is
highest weight cyclic. Suppose that $(\pi, V)$ and $(\sigma, X)$ are two such
representations with the same highest weight $\mu_0$. Let $v$ and $w$ be highest
weight cyclic vectors for $V$ and $X$, respectively. Now, consider the
representation $V \oplus X$, and let $U$ be the smallest invariant subspace of
$V \oplus X$ that contains the vector $(v, w)$.

The weights occurring in $V \oplus X$ are simply the weights of $V$ together
with the weights of $X$. This means that $\mu_0$ is the highest weight occurring
in $V \oplus X$. Since $(v, w)$ is a weight vector with weight $\mu_0$ and since
this vector generates $U$ (by definition), $U$ is a highest weight cyclic
representation, and, therefore, irreducible by Proposition 7.19. Consider the two
"projection" maps $P_1 : V \oplus X \to V$, $P_1(v, w) = v$, and
$P_2 : V \oplus X \to X$, $P_2(v, w) = w$. It is easy to check that $P_1$ and
$P_2$ are intertwining maps of representations. Therefore, the restrictions of
$P_1$ and $P_2$ to $U \subset V \oplus X$ will also be intertwining maps.

Clearly, neither $P_1|_U$ nor $P_2|_U$ is the zero map (since both are nonzero
on $(v, w)$). Moreover, $U$, $V$, and $X$ are all irreducible. Therefore, by
Schur's Lemma, $P_1|_U$ is an isomorphism of $U$ with $V$, and $P_2|_U$ is an
isomorphism of $U$ with $X$. Thus, $V \cong U \cong X$. □


Proposition 7.21. If $\pi$ is an irreducible representation of $\mathfrak{g}$,
then the highest weight $\mu_0$ of $\pi$ is a dominant integral element.


Proof. We know already that all of the weights of $\pi$ (not just the highest
weight) must be integral. Now, suppose that $\mu_0$ is the highest weight of
$\pi$ and that $v$ is a nonzero weight vector with weight $\mu_0$. Then,
$\pi(X_\alpha)v = 0$ for all positive simple roots $\alpha$ (otherwise,
$\pi(X_\alpha)v$ would be a weight vector with weight higher than $\mu_0$). Now,
consider the subalgebra
$\mathfrak{s}^\alpha = \operatorname{span}\{X_\alpha, Y_\alpha, H_\alpha\}$ of
$\mathfrak{g}$, which is isomorphic to sl(2;C). Then, $v$ is an eigenvector for
$\pi(H_\alpha)$ with eigenvalue $\mu_0(H_\alpha)$, and $v$ is annihilated by
$\pi(X_\alpha)$. By Theorem 4.12, this can occur only if $\mu_0(H_\alpha)$ is a
non-negative integer. Thus, $\mu_0$ is dominant integral. □


We have now completed the proof of Theorem 7.15, except for Point 4, 
namely that every dominant integral element arises as the highest weight of 
some irreducible representation. The strategy we used to prove this in the 
case of sl(3;C) does not work in general. We devote the next three sections to 
a discussion of three different proofs of Point 4. 


7.3 Constructing the Representations I: Verma Modules 


We now turn to a discussion of the three standard methods of constructing an
irreducible finite-dimensional representation having a given dominant integral
element as its highest weight, namely Verma modules, the Peter–Weyl theory,
and the Borel–Weil theory. We begin, in this section, with the Verma module
approach. Given any $\mu$ in $\mathfrak{h}$, we will construct a representation
$V_\mu$ called a Verma module. ("Module" is just another word for a
representation.) Here, $\mu$ can truly be any element of $\mathfrak{h}$, not
necessarily dominant and not necessarily integral. The catch is that the Verma
module $V_\mu$ is always infinite dimensional, even if $\mu$ is dominant
integral. We will see eventually that if $\mu$ is dominant integral, then
$V_\mu$ contains an invariant subspace $U_\mu$ such that the quotient space
$V_\mu/U_\mu$ is finite dimensional and irreducible and has highest weight $\mu$.


7.3.1 Verma modules 


The Verma module is constructed as follows. Let $\mathfrak{n}^+$ be the subspace
of $\mathfrak{g}$ spanned by the root spaces $\mathfrak{g}_\alpha$ where $\alpha$
is a positive root. This is a subalgebra of $\mathfrak{g}$ since
$[\mathfrak{g}_\alpha, \mathfrak{g}_\beta] \subset \mathfrak{g}_{\alpha+\beta}$
and if $\alpha$ and $\beta$ are positive roots, then
$\mathfrak{g}_{\alpha+\beta}$ is either zero or is a root space corresponding to
the positive root $\alpha + \beta$. Similarly, let $\mathfrak{n}^-$ be the span
of the root spaces corresponding to negative roots, which is also a subalgebra.
Then, $\mathfrak{g}$ decomposes as a direct sum (in the vector space sense) of
$\mathfrak{h}$, $\mathfrak{n}^+$, and $\mathfrak{n}^-$. Now, let $\mu$ be any
element of $\mathfrak{h}$. We want to construct an (infinite-dimensional)
representation $V_\mu$ of $\mathfrak{g}$ having highest weight $\mu$. We first
describe $V_\mu$ as a vector space and then describe the action of
$\mathfrak{g}$ on it.

As in the previous section, we let $\alpha_1, \ldots, \alpha_r$ be the positive
simple roots and we let $\alpha_{r+1}, \ldots, \alpha_m$ be the remaining
positive roots. We choose nonzero elements $X_l \in \mathfrak{g}_{\alpha_l}$ and
$Y_l \in \mathfrak{g}_{-\alpha_l}$, $1 \leq l \leq m$. We begin with a vector
$v_0$ that will be our highest weight vector. Then, the rest of $V_\mu$ will be
finite linear combinations of vectors of the form

$$\pi_\mu(Y_{i_1})\pi_\mu(Y_{i_2})\cdots\pi_\mu(Y_{i_N})v_0. \tag{7.5}$$

There are certain dependence relations among such vectors that are forced on
us by the commutation relations in the subalgebra $\mathfrak{n}^-$. For example,
if $\pi_\mu$ is going to be a representation, then we must have

$$\pi_\mu(Y_k)\pi_\mu(Y_l) = \pi_\mu(Y_l)\pi_\mu(Y_k) + \pi_\mu([Y_k, Y_l]),$$

where $[Y_k, Y_l]$ is again in $\mathfrak{n}^-$ and, therefore, expressible as a
linear combination $\sum_j c_{klj} Y_j$ of the $Y_j$'s. Thus,

$$\pi_\mu(Y_k)\pi_\mu(Y_l)v_0 = \pi_\mu(Y_l)\pi_\mu(Y_k)v_0 + \sum_{j=1}^{m} c_{klj}\,\pi_\mu(Y_j)v_0.$$

We construct $V_\mu$ as a vector space by imposing only those dependence
relations among vectors of the form (7.5) that are forced on us by the
commutation relations of $\mathfrak{n}^-$. As a vector space, $V_\mu$ is
isomorphic to the "universal enveloping algebra" $U(\mathfrak{n}^-)$ of
$\mathfrak{n}^-$. (See Section 17.2 of Humphreys (1972).) It follows from this
that $V_\mu$ is always infinite dimensional.

We now want to describe an action of $\mathfrak{g}$ on this space. For
$Y \in \mathfrak{n}^-$, $\pi_\mu(Y)$ acts in the only possible way, namely by
adding on one more factor of $\pi_\mu(Y)$ on the left. For the action of
$\mathfrak{h}$, we decree that $v_0$ be a weight vector with weight $\mu$:

$$\pi_\mu(H)v_0 = \langle\mu, H\rangle v_0, \quad H \in \mathfrak{h}. \tag{7.6}$$

By Proposition 7.11 (the proof of which is perfectly valid even for
infinite-dimensional representations), each $\pi_\mu(Y_l)$ lowers the weight by
$\alpha_l$, and so we must have

$$\begin{aligned}
&\pi_\mu(H)\pi_\mu(Y_{i_1})\pi_\mu(Y_{i_2})\cdots\pi_\mu(Y_{i_N})v_0 \\
&\quad = \left(\mu(H) - \alpha_{i_1}(H) - \cdots - \alpha_{i_N}(H)\right)\pi_\mu(Y_{i_1})\pi_\mu(Y_{i_2})\cdots\pi_\mu(Y_{i_N})v_0.
\end{aligned} \tag{7.7}$$

It remains then to describe the action of $\mathfrak{n}^+$ on $V_\mu$. If $\mu$
is actually going to be the highest weight occurring in $V_\mu$, then we must
have

$$\pi_\mu(X)v_0 = 0 \tag{7.8}$$

for all $X \in \mathfrak{n}^+$. Then, if we want to apply $\pi_\mu(X)$,
$X \in \mathfrak{n}^+$, to an element of the form (7.5), we apply Lemma 5.14. The
lemma allows us to rewrite

$$\pi_\mu(X)\pi_\mu(Y_{i_1})\pi_\mu(Y_{i_2})\cdots\pi_\mu(Y_{i_N}) \tag{7.9}$$

as a linear combination of terms in which the elements of $\mathfrak{n}^+$ are to
the right (acting first), the elements of $\mathfrak{h}$ are next, and the
elements of $\mathfrak{n}^-$ are to the left (acting last). If we apply all of
the resulting terms to $v_0$, any terms which actually contain any factors from
$\mathfrak{n}^+$ must be zero, by (7.8). All of the other terms involve only
factors from $\mathfrak{h}$ and $\mathfrak{n}^-$, with the elements of
$\mathfrak{h}$ acting first. When we apply such a term to $v_0$, the factors from
$\mathfrak{h}$ simply give constants, because of (7.6). This leaves us again with
a linear combination of terms of the form (7.5), which means that we have a
constructive procedure for determining how $\pi_\mu(X)$ acts on $V_\mu$.
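
As a small illustration of this reordering procedure, consider the simplest case $\mathfrak{g}$ = sl(2;C). The following sketch assumes the usual basis $X$, $Y$, $H$ with $[X, Y] = H$ and $[H, Y] = -2Y$, so that $\mathfrak{n}^+$ is spanned by $X$, $\mathfrak{n}^-$ is spanned by $Y$, and the number $\langle\mu, H\rangle$ is abbreviated to $\mu$. Moving $\pi_\mu(X)$ to the right past each factor of $\pi_\mu(Y)$ and then using (7.6), (7.7), and (7.8) gives:

```latex
\begin{align*}
\pi_\mu(X)\pi_\mu(Y)v_0
  &= \pi_\mu(Y)\pi_\mu(X)v_0 + \pi_\mu(H)v_0 = \mu\, v_0, \\
\pi_\mu(X)\pi_\mu(Y)^2 v_0
  &= \pi_\mu(Y)\pi_\mu(X)\pi_\mu(Y)v_0 + \pi_\mu(H)\pi_\mu(Y)v_0 \\
  &= \mu\,\pi_\mu(Y)v_0 + (\mu - 2)\,\pi_\mu(Y)v_0
   = 2(\mu - 1)\,\pi_\mu(Y)v_0.
\end{align*}
```

In both lines the terms containing $\pi_\mu(X)v_0$ drop out and what remains is again a combination of vectors of the form (7.5).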

It is not completely clear that this procedure really yields a well-defined
representation of $\mathfrak{g}$. The action of $\mathfrak{n}^+$ is the most
problematic in this regard; there are many different ways to commute the factors
in (7.9) into the desired form and one needs to know that the value of

$$\pi_\mu(X)\pi_\mu(Y_{i_1})\pi_\mu(Y_{i_2})\cdots\pi_\mu(Y_{i_N})v_0$$

is the same no matter which way is used. Nevertheless, fairly elementary
algebraic means (Section 20.3 of Humphreys (1972)) can be used to show
that the Verma module is well defined.

It is important to note that the Verma module is a representation of the Lie
algebra $\mathfrak{g}$ only; there is no associated representation of the
simply-connected compact group $K$. Although in the finite-dimensional case
every representation of $\mathfrak{g}$ comes from a representation of $K$, this
result does not generalize to the infinite-dimensional case. What goes wrong is
that in the infinite-dimensional case, the exponential of an operator may not be
defined, because the series defining the exponential may not converge in any
reasonable sense. We will revisit this issue in the next subsection.


7.3.2 Irreducible quotient modules 


The good thing about Verma modules is that it is fairly easy to prove they
exist. The bad thing is that they are always infinite dimensional, even when
the highest weight $\mu$ is dominant integral. The strategy for constructing
finite-dimensional representations is first to show that every Verma module has
a largest proper invariant subspace $U_\mu$ and that the quotient space
$V_\mu/U_\mu$ is irreducible with highest weight $\mu$. This much is true for any
$\mu$ in $\mathfrak{h}$. Then, one shows that in the case that $\mu$ is dominant
integral, the quotient space is finite dimensional.

Let us look into this strategy in greater detail. The invariant subspace $U_\mu$
is defined as follows. It follows from (7.7) and Proposition B.14 that $V_\mu$ is
a direct sum of its weight spaces. Thus, for any vector $v$ in $V_\mu$, it makes
sense to talk about the component of $v$ in the one-dimensional subspace spanned
by $v_0$, which we refer to as the $v_0$-component of $v$.


Definition 7.22. Given a Verma module $V_\mu$, let $U_\mu$ be the subspace of
$V_\mu$ consisting of all vectors $v$ such that the $v_0$-component of $v$ is
zero and such that the $v_0$-component of

$$\pi_\mu(X^1)\cdots\pi_\mu(X^l)v$$

is also zero for any collection of vectors $X^1, \ldots, X^l$ in
$\mathfrak{n}^+$.


Certainly the zero vector is in $U_\mu$; for some $\mathfrak{g}$'s and $\mu$'s,
it happens that $U_\mu = \{0\}$.


Proposition 7.23. The space $U_\mu$ is an invariant subspace for the action of
$\mathfrak{g}$.


Proof. Suppose that $v$ is in $U_\mu$ and that $Z$ is some element of
$\mathfrak{g}$. We want to show that $\pi_\mu(Z)v$ is also in $U_\mu$. Thus, we
consider

$$\pi_\mu(X^1)\cdots\pi_\mu(X^l)\pi_\mu(Z)v \tag{7.10}$$

and we must show that the $v_0$-component of this vector is zero. Using Lemma
5.14, we may rewrite the vector in (7.10) as a linear combination of vectors of
the form

$$\pi_\mu(Y^1)\cdots\pi_\mu(Y^j)\,\pi_\mu(H^1)\cdots\pi_\mu(H^k)\,\pi_\mu(X^1)\cdots\pi_\mu(X^m)v, \tag{7.11}$$

where the $Y$'s are in $\mathfrak{n}^-$, the $H$'s are in $\mathfrak{h}$, and the
$X$'s are in $\mathfrak{n}^+$. However, since $v$ is in $U_\mu$, the
$v_0$-component of

$$\pi_\mu(X^1)\cdots\pi_\mu(X^m)v \tag{7.12}$$

is zero, and thus this vector is a linear combination of weight vectors with
weight lower than $\mu$. Then, applying elements of $\mathfrak{h}$ and
$\mathfrak{n}^-$ to the vector in (7.12) will only keep the weights the same or
lower them. Thus, the $v_0$-component of the vector in (7.11), and hence also the
$v_0$-component of the vector in (7.10), is zero. This shows that $\pi_\mu(Z)v$
is, again, in $U_\mu$. □


If V is any vector space and U is a subspace of V, then one can form the 
quotient space V/U. The construction of V/U is analogous to the construc- 
tion of quotient groups, as described in Appendix A. We define two elements 
of V to be equivalent if their difference is an element of U and then V/U 
is defined to be the set of equivalence classes. Because U is a subspace, the 
vector space operations on V (addition and scalar multiplication) “descend” 
unambiguously to equivalence classes and make V/U into a vector space. If V 
carries a representation of some Lie algebra g and if U is an invariant subspace 
of V, then the action of g on V descends to an action on V/U and, thus, the 
space V/U carries a representation of g, called the quotient representation. 
We apply the quotient construction to the Verma module $V_\mu$ and the invariant
subspace $U_\mu$.


Proposition 7.24. The quotient space $V_\mu/U_\mu$ is an irreducible
representation of $\mathfrak{g}$.


Proof. A simple argument shows that the invariant subspaces of the
representation $V_\mu/U_\mu$ are in one-to-one correspondence with the invariant
subspaces of $V_\mu$ that contain $U_\mu$. So, to prove that $V_\mu/U_\mu$ is
irreducible is equivalent to showing that any invariant subspace of $V_\mu$ that
contains $U_\mu$ is either $U_\mu$ or $V_\mu$. Suppose, then, that $W$ is an
invariant subspace that contains $U_\mu$ and suppose that $W \neq U_\mu$ (i.e.,
that $W$ contains at least one vector $v$ not contained in $U_\mu$). This means
that $W$ contains a vector $u = \pi_\mu(X^1)\cdots\pi_\mu(X^l)v$ whose
$v_0$-component is nonzero.

I claim then that $W$ must contain $v_0$ itself. To see this, we decompose $u$ as
a nonzero multiple of $v_0$ plus a sum of weight vectors corresponding to weights
$\lambda \neq \mu$. Since $\lambda \neq \mu$, we can find $H$ in $\mathfrak{h}$
with $\langle\lambda, H\rangle \neq \langle\mu, H\rangle$ and then we may apply
to $u$ the operator $\pi_\mu(H) - \langle\lambda, H\rangle I$. This operator will
keep us in $W$ and will "kill" the component of $u$ that is in the weight space
corresponding to the weight $\lambda$ while leaving the $v_0$-component of $u$
nonzero. We then continue applying operators of this form until we have killed
all the components of $u$ in weight spaces different from $\mu$, giving us a
nonzero multiple of $v_0$.

This means that $W$ contains $v_0$ and, therefore (in light of (7.5)), all of
$V_\mu$. So, any invariant subspace of $V_\mu$ that properly contains $U_\mu$
must be equal to $V_\mu$. This shows that $V_\mu/U_\mu$ is irreducible. □


Since for each $u \in U_\mu$ the $v_0$-component of $u$ is zero, it is not hard
to see that the quotient space $V_\mu/U_\mu$ still has highest weight $\mu$. So,
for any $\mu$ in $\mathfrak{h}$ (not necessarily dominant or integral), we have a
method of constructing an irreducible representation of $\mathfrak{g}$ with
highest weight $\mu$. Of course, we do not know that this representation is
finite dimensional; indeed, it cannot be finite dimensional unless $\mu$ is
dominant integral. Therefore, the crucial last step in the argument is to show
that in the dominant integral case, the quotient space $V_\mu/U_\mu$ is finite
dimensional.


7.3.3 Finite-dimensional quotient modules 


The way we will prove finite dimensionality is to show that there is an action
of the Weyl group on $V_\mu/U_\mu$ that transforms the weights in the same way
as in the finite-dimensional case. This will show that the set of weights for
$V_\mu/U_\mu$ is invariant under the action of the Weyl group on $\mathfrak{h}$.
Meanwhile, if $\mu$ is dominant integral, then all of the weights of
$V_\mu/U_\mu$ must be integral (since they are of the form in the right-hand side
of (7.7)), and all the weights must be lower than $\mu$. However, standard Weyl
group theory implies that there are only finitely many integral elements
$\lambda$ with the property that $w \cdot \lambda$ is lower than $\mu$ for all
$w \in W$. So, if we can show that the Weyl group acts on $V_\mu/U_\mu$, then we
will conclude that there are only finitely many weights in $V_\mu/U_\mu$. Since
(even in the Verma module) each weight has finite multiplicity, this will show
that $V_\mu/U_\mu$ is finite dimensional. (Note that the set of weights for the
Verma module itself is never invariant under the action of $W$ on
$\mathfrak{h}$.)

How, then, do we construct an action of the Weyl group on $V_\mu/U_\mu$? Recall
that in the finite-dimensional case we exponentiate each representation $\pi$ of
$\mathfrak{g}$ to get a representation $\Pi$ of the simply-connected compact
group $K$. The action of the Weyl group is then obtained by restricting $\Pi$ to
the subgroup $N(\mathfrak{t}) \subset K$. In the infinite-dimensional case, the
exponential of an operator is not necessarily well defined (since the series for
the exponential may not converge) and, so, we cannot, in general, obtain a
representation $\Pi$ of the group $K$. If $\mu$ is dominant integral, then we
will eventually conclude that $V_\mu/U_\mu$ is finite dimensional, but, of
course, we are not allowed to assume that at this stage.

This means that we need a method of exponentiating operators that can
be used in a possibly infinite-dimensional space. To do this, we introduce the
concept of a locally nilpotent operator. A linear operator $X$ on an arbitrary
vector space $V$ is said to be locally nilpotent if for each $v \in V$, there
exists a positive integer $k$ such that $X^k v = 0$. If $V$ is finite
dimensional, then a locally nilpotent operator must actually be nilpotent, that
is, there must exist a single $k$ such that $X^k v = 0$ for all $v$. In the
infinite-dimensional case, the value of $k$ depends on $v$ and there may be no
single value of $k$ that works for all $v$. If $X$ is locally nilpotent, then we
define $e^X$ to be the operator satisfying

$$e^X v = \sum_{k=0}^{\infty} \frac{X^k v}{k!},$$

where for each $v \in V$ the series on the right terminates.
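
A familiar example, offered here only as an illustration and not taken from the text, is the derivative operator $D = d/dx$ on the infinite-dimensional space of all polynomials. It is locally nilpotent (a polynomial of degree $n$ is killed by $D^{n+1}$) but not nilpotent, and by Taylor's theorem $e^D p(x) = p(x + 1)$. The sketch below assumes a polynomial is stored as its coefficient list `[c0, c1, c2, ...]`; the function names are hypothetical.

```python
from fractions import Fraction
from itertools import zip_longest

def derivative(p):
    """d/dx applied to the polynomial with coefficients [c0, c1, c2, ...]."""
    return [k * c for k, c in enumerate(p)][1:]

def add(p, q):
    """Coefficientwise sum of two polynomials of possibly different degree."""
    return [a + b for a, b in zip_longest(p, q, fillvalue=Fraction(0))]

def exp_of_derivative(p):
    """Compute e^D p = sum_k D^k p / k!, stopping once D^k p = 0.

    The loop terminates because each derivative shortens the coefficient list.
    """
    total = [Fraction(c) for c in p]
    term = total
    k = 0
    while True:
        k += 1
        term = [c / k for c in derivative(term)]  # term is now D^k p / k!
        if not term:  # the empty list is the zero polynomial
            return total
        total = add(total, term)

print(exp_of_derivative([0, 0, 1]))     # coefficients of (x + 1)^2
print(exp_of_derivative([0, 0, 0, 1]))  # coefficients of (x + 1)^3
```

The number of terms needed grows with the degree of the input, which is exactly the situation described above: the value of $k$ depends on $v$.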


Proposition 7.25. For each positive simple root $\alpha \in \Delta$, let
$X_\alpha$ be an element of $\mathfrak{g}_\alpha$ and let $Y_\alpha$ be an
element of $\mathfrak{g}_{-\alpha}$. If $\mu$ is dominant integral, then
$X_\alpha$ and $Y_\alpha$ act in a locally nilpotent fashion on the quotient
space $V_\mu/U_\mu$.


We will give the proof of this result at the end of this subsection. Let us
now continue with the argument for the finite dimensionality of $V_\mu/U_\mu$.


Proposition 7.26. If $\mu$ is dominant integral, then the set of weights for
$V_\mu/U_\mu$ is invariant under the action of the Weyl group on $\mathfrak{h}$.


Proof. We make use of a result from Weyl group theory, namely that $W$ is
generated by the reflections $w_\alpha$, where $\alpha$ ranges over the set
$\Delta$ of positive simple roots. (See Section 8.7.) So, it suffices to show
that the set of weights is invariant under each $w_\alpha$, $\alpha \in \Delta$.

Let $\tilde\pi_\mu$ denote the action of $\mathfrak{g}$ on the quotient space
$V_\mu/U_\mu$. For each positive simple root $\alpha$, let
$X_\alpha \in \mathfrak{g}_\alpha$ and $Y_\alpha \in \mathfrak{g}_{-\alpha}$ be
as in Theorem 6.20. Since, by Proposition 7.25, $\tilde\pi_\mu(X_\alpha)$ and
$\tilde\pi_\mu(Y_\alpha)$ are locally nilpotent, it makes sense to exponentiate
these operators. Define, then, operators $B_\alpha$ on $V_\mu/U_\mu$ by

$$B_\alpha = e^{\tilde\pi_\mu(X_\alpha)}\, e^{-\tilde\pi_\mu(Y_\alpha)}\, e^{\tilde\pi_\mu(X_\alpha)}.$$

This operator is going to describe the action of the Weyl group element
$w_\alpha$ on $V_\mu/U_\mu$. (Compare the expression (6.24) for the $B_\alpha$'s
following Theorem 6.31.)

Assume that $v$ is a weight vector in $V_\mu/U_\mu$ with weight $\lambda$. We
want to prove, then, that $B_\alpha v$ is a weight vector with weight
$w_\alpha \cdot \lambda$, where, as usual, $w_\alpha$ is the reflection about the
hyperplane perpendicular to $\alpha$. To do this, it suffices to show that

$$\tilde\pi_\mu(H) B_\alpha = B_\alpha \tilde\pi_\mu(w_\alpha \cdot H).$$

This is really just a Lie algebra calculation; if it is true in sl(2;C), then it
is true here as well.

To be a bit more precise about this, let $\tilde X_\alpha$, $\tilde Y_\alpha$,
and $\tilde H$ stand for $\tilde\pi_\mu(X_\alpha)$, $\tilde\pi_\mu(Y_\alpha)$,
and $\tilde\pi_\mu(H)$, respectively. Then, we have

$$\tilde H\, e^{\tilde X_\alpha} e^{-\tilde Y_\alpha} e^{\tilde X_\alpha} = e^{\tilde X_\alpha} e^{-\tilde Y_\alpha} e^{\tilde X_\alpha}\, \mathrm{Ad}_{e^{-\tilde X_\alpha}} \mathrm{Ad}_{e^{\tilde Y_\alpha}} \mathrm{Ad}_{e^{-\tilde X_\alpha}} (\tilde H).$$

Now, the relationship between Ad and ad still holds for locally nilpotent
operators in the infinite-dimensional case (think of the power series argument
in Exercise 19 of Chapter 2) and, so,

$$\mathrm{Ad}_{e^{-\tilde X_\alpha}} \mathrm{Ad}_{e^{\tilde Y_\alpha}} \mathrm{Ad}_{e^{-\tilde X_\alpha}} (\tilde H) = e^{-\mathrm{ad}\,\tilde X_\alpha}\, e^{\mathrm{ad}\,\tilde Y_\alpha}\, e^{-\mathrm{ad}\,\tilde X_\alpha} (\tilde H). \tag{7.13}$$

If $H$ is such that $\langle\alpha, H\rangle = 0$, then (7.13) is simply equal to
$\tilde H$. If $H = H_\alpha$, then all of (7.13) is taking place in a
three-dimensional Lie algebra isomorphic to sl(2;C); thus, the answer is the same
as in the sl(2;C) case, namely $-\tilde H_\alpha$. In either case (check), (7.13)
is equal to $\tilde\pi_\mu(w_\alpha \cdot H)$. (Compare Exercise 8 in
Chapter 6.) □
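
The sl(2;C) computation invoked in the proof can be checked by hand or, as in the following sketch, by multiplying 2 x 2 matrices. This is only a sanity check in the defining representation, under the assumption that $X$, $Y$, $H$ are the usual matrices with $[X, Y] = H$; the helper names are hypothetical. Since $X^2 = Y^2 = 0$ here, each exponential series stops after two terms.

```python
def matmul(a, b):
    """Product of two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

I2 = [[1, 0], [0, 1]]
X = [[0, 1], [0, 0]]
Y = [[0, 0], [1, 0]]
H = [[1, 0], [0, -1]]

def exp_square_zero(n, sign=1):
    """e^{sign*n} for a 2x2 matrix n with n^2 = 0: the series is I + sign*n."""
    assert matmul(n, n) == [[0, 0], [0, 0]]
    return [[I2[i][j] + sign * n[i][j] for j in range(2)] for i in range(2)]

# B = e^X e^{-Y} e^X, the analogue of B_alpha in the defining representation.
B = matmul(matmul(exp_square_zero(X), exp_square_zero(Y, sign=-1)),
           exp_square_zero(X))
minus_H = [[-h for h in row] for row in H]

print(B)
print(matmul(H, B) == matmul(B, minus_H))
```

The final comparison is the identity $H B = B\,(w_\alpha \cdot H)$ with $w_\alpha \cdot H_\alpha = -H_\alpha$, which is the case $H = H_\alpha$ of the relation proved above.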


In the dominant integral case, the weights for $V_\mu/U_\mu$ are invariant under
the action of the Weyl group and all of the weights are integral (since the
weights that occur differ from $\mu$ by an integer linear combination of roots).
A standard result from Weyl group theory (see Section 8.7) says that there are
only finitely many integral elements $\lambda$ such that $w \cdot \lambda$ is
lower than $\mu$ for all $w \in W$. So, we conclude that if $\mu$ is dominant
integral, then there are only finitely many weights in $V_\mu/U_\mu$.

Meanwhile, for any $\mu \in \mathfrak{h}$, we know that $V_\mu/U_\mu$ has at
least one weight space, namely the one with weight $\mu$. (This weight space
survives the passage from $V_\mu$ to $V_\mu/U_\mu$ because the elements of
$U_\mu$ have no $v_0$-component.) Since, as we have shown, $V_\mu/U_\mu$ is
irreducible, the same argument as in the finite-dimensional case shows that
$V_\mu/U_\mu$ is the direct sum of its weight spaces. Furthermore, all of the
weights for $V_\mu$, and so also for $V_\mu/U_\mu$, have finite multiplicity.

We conclude, then, that in the dominant integral case, $V_\mu/U_\mu$ is the
direct sum of its weight spaces, there are only finitely many of these weight
spaces, and each of the weight spaces has finite dimension. This shows that
$V_\mu/U_\mu$ is finite dimensional and establishes that each dominant integral
element arises as the highest weight of a finite-dimensional irreducible
representation of $\mathfrak{g}$.

It now remains only to provide the proof of Proposition 7.25. 
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Proof. As in the previous proof, we use X as an abbreviation for p(X), for 
any X € g. We also make use of the following standard result from the theory 
of semisimple Lie algebras: The subalgebra of g generated by the spaces ga, 
with a in A, is equal to the span of the spaces ga, where a ranges over all of 
Rt. This follows from Proposition 8.4(d) and the corollary to Lemma 10.2A 
of Humphreys (1972). (See also Proposition 14.2 in Humphreys (1972).) For 
each root a, we let s* denote the three-dimensional subalgebra { Xo, Ya, Ha} 
given by Theorem 6.20. 


Step 1. For each a € A, Xq is locally nilpotent. Every vector v € Va, 
and so also every vector in V,,/U,,, is a finite linear combination of vectors of 
the form (7.5) and (by (7.7)) these vectors are weight vectors. Applying Xq 
repeatedly will raise all the weights until they are no longer lower than u and, 
at that point, X*v must be zero. 


Step 2. For each positive simple root a, there exists a nonzero finite- 
dimensional subspace of V,,/U, that is invariant under s®. Let 


m= (n, Ha) , 
which is a non-negative integer because jz is dominant integral. Now, consider 
the vectors Yu, k = 0,1,2,.... Then, from the calculations in Chapter 4, 
we have 
HoY Fup = (m — 2k)Ý Evo, (7.14) 
XoVFup = k(m +1 — k)\Ýf- vo. (7.15) 


In particular, Xa Ý +t vo = 0. 

Now, consider 8 € A with 8 # a. Then, I claim that [Xg, Ýa] = 0. To 
see this, note that Ya € g-a and Xg € gg, and, therefore, [Xg, Ya] € g-a- 
However, 3 — œ is nonzero and cannot be a root. After all, every root has a 
unique expansion in terms of elements of A and in this expansion all nonzero 
coefficients have the same sign, whereas 8 — a has one positive coefficient and 
one negative coefficient. It follows that gg-a = {0}; thus [Xg, Ya] = 0 and so, 
also, [Xg, Ya] = 0. This being the case, we have XgY7"*!up = V+1Xgu9 = 0. 
Thus, rtv is annihilated by all of the Xj’ g's, B € A, and so also by all of 
the Xg’ g’s, B E R* (by the result at the beginning of this proof). 

Now, if Y"+1u9 were nonzero, then since Yirtly, is annihilated by all of 
the Xe s, with 6 in Rt, the proof of Proposition 7.17 would tell us that the 
smallest g-invariant subspace containing Yr+lyp would have highest weight 
—(m-+1)a. Since, on the contrary, V,,/U,, is irreducible with highest weight 
p, we must have Y2"+!u9 = 0. This (together with (7.14) and (7.15)) tells us 
that the space spanned by vo, Ýavo, EE Ýr vo is invariant under s®. 


Step 3. Given any a € A, every vector in V,,/U is contained in a finite- 
dimensional s°-invariant subspace. Call a vector v € V,,/U,, s*-finite if v is 
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contained in a finite-dimensional s*-invariant subspace. Then, define a sub- 
space Ta of V,/U,, by 


Ty = {v € V,/U,|v is s°-finite} . 


Step 2 shows that $T_\alpha \neq \{0\}$. I claim that $T_\alpha$ is invariant under $\mathfrak{g}$. To see
this, consider $v \in T_\alpha$ and let $T$ be a finite-dimensional $\mathfrak{s}^\alpha$-invariant subspace
containing $v$. Now, let $\{Z_k\}$ be a basis for $\mathfrak{g}$ and let $T'$ be the sum of
the spaces $Z_k T$. Since $\mathfrak{g}$ is finite dimensional, $T'$ is also finite dimensional.
However, now we can see that $T'$ is invariant under $\mathfrak{s}^\alpha$, since for $Z^\alpha$ any
element of $\mathfrak{s}^\alpha$,

$$Z^\alpha Z_k T = Z_k Z^\alpha T + [Z^\alpha, Z_k] T,$$

and $Z^\alpha T \subset T$ and $[Z^\alpha, Z_k]$ is a linear combination of the $Z_l$'s. So, $T'$ is a
finite-dimensional $\mathfrak{s}^\alpha$-invariant subspace that contains $Zv$ for all $Z \in \mathfrak{g}$. This
shows that $Zv$ is again in $T_\alpha$, and, so, $T_\alpha$ is invariant under $\mathfrak{g}$.

Since $V_\mu/U_\mu$ is irreducible and $T_\alpha$ is nonzero and $\mathfrak{g}$-invariant, we conclude
that $T_\alpha = V_\mu/U_\mu$.


Step 4. For each $\alpha \in \Delta$, $Y_\alpha$ is locally nilpotent. Given any $v \in V_\mu/U_\mu$,
$v$ is contained in a finite-dimensional $\mathfrak{s}^\alpha$-invariant subspace $T$. By complete
reducibility for sl(2;C), $T$ decomposes as a direct sum of (finitely many)
irreducible $\mathfrak{s}^\alpha$-invariant subspaces. In each of these irreducible spaces, we have
completely worked out (in Chapter 4) the action of $Y_\alpha$ and this action is
nilpotent. So, $Y_\alpha^k v = 0$, where $k$ is the maximum of the dimensions of the
irreducible summands in $T$.


This concludes the proof of Proposition 7.25. $\square$


7.3.4 The sl(2;C) case 


Let us see how this all works out in the case of sl(2;C). If $X$, $Y$, and $H$ are
the usual basis elements, we work with the Cartan subalgebra $\mathfrak{h} = \mathrm{span}(H)$.
Weights may then be thought of simply as eigenvalues for $\pi_\mu(H)$. We build
a vector space containing linearly independent vectors $v_0, v_1, v_2, \ldots$, and we
define $V_\mu$ as the space of finite linear combinations of these vectors. Here, $\mu$
is an arbitrary complex number and the vector space itself does not depend
on $\mu$.

We now describe an action of sl(2;C) on this space as follows. For $H$, we
define

$$\pi_\mu(H) v_0 = \mu v_0,$$
$$\pi_\mu(H) v_k = (\mu - 2k) v_k.$$

For $Y$, we define

$$\pi_\mu(Y) v_k = v_{k+1}.$$

Finally, for $X$, we define

$$\pi_\mu(X) v_0 = 0, \tag{7.16}$$
$$\pi_\mu(X) v_k = k(\mu + 1 - k) v_{k-1}, \quad k \geq 1. \tag{7.17}$$


The calculations of Chapter 4 show that if we define $\pi_\mu(H)$ and $\pi_\mu(Y)$ as
above, then we must define $\pi_\mu(X)$ as above, if $\pi_\mu(X)$, $\pi_\mu(Y)$, and $\pi_\mu(H)$ are
to satisfy the sl(2;C) commutation relations. Direct calculation then shows
that with these definitions the operators really do satisfy these commutation
relations.
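
The direct calculation can also be spot-checked by machine. The following is a minimal sketch, not part of the original argument: it stores a finite linear combination of the basis vectors as a dictionary from the index $k$ to a coefficient, and tests the relations $[H,X] = 2X$, $[H,Y] = -2Y$, $[X,Y] = H$ on the first few basis vectors. The function names and the cutoff `max_k` are illustrative choices, and exact rational values of $\mu$ are used so that the comparisons are exact.

```python
from fractions import Fraction

def act_H(vec, mu):
    # pi(H) v_k = (mu - 2k) v_k
    return {k: (mu - 2 * k) * c for k, c in vec.items() if (mu - 2 * k) * c != 0}

def act_Y(vec, mu):
    # pi(Y) v_k = v_{k+1}; mu is unused but kept for a uniform signature
    return {k + 1: c for k, c in vec.items()}

def act_X(vec, mu):
    # pi(X) v_0 = 0 and pi(X) v_k = k (mu + 1 - k) v_{k-1} for k >= 1
    out = {}
    for k, c in vec.items():
        coeff = k * (mu + 1 - k) * c
        if k >= 1 and coeff != 0:
            out[k - 1] = out.get(k - 1, 0) + coeff
    return out

def combine(u, a, v, b):
    # The linear combination a*u + b*v, with zero coefficients dropped.
    out = {}
    for vec, scalar in ((u, a), (v, b)):
        for k, c in vec.items():
            out[k] = out.get(k, 0) + scalar * c
    return {k: c for k, c in out.items() if c != 0}

def commutator(A, B, vec, mu):
    # [A, B] applied to vec, i.e. A(B(vec)) - B(A(vec))
    return combine(A(B(vec, mu), mu), 1, B(A(vec, mu), mu), -1)

def relations_hold(mu, max_k=12):
    # Check the three sl(2;C) relations on v_0, ..., v_{max_k}.
    for k in range(max_k + 1):
        v = {k: 1}
        ok = (commutator(act_H, act_X, v, mu) == combine(act_X(v, mu), 2, {}, 0)
              and commutator(act_H, act_Y, v, mu) == combine(act_Y(v, mu), -2, {}, 0)
              and commutator(act_X, act_Y, v, mu) == act_H(v, mu))
        if not ok:
            return False
    return True

print(relations_hold(Fraction(1, 3)), relations_hold(3))
```

A check on finitely many basis vectors is of course only a sanity test; the relations themselves follow from the one-line computations with the coefficients $k(\mu + 1 - k)$.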

Let us now compute the invariant subspace $U_\mu$. If we begin with the vector
$v_k$, then using (7.17) repeatedly, we obtain

$$\pi_\mu(X)^k v_k = \left[ \prod_{l=1}^{k} l(\mu + 1 - l) \right] v_0. \tag{7.18}$$

If $\mu \neq l - 1$ for every $l$ in the range $1, \ldots, k$, then the coefficient of $v_0$ will be
nonzero. So, if $\mu$ is anything other than a non-negative integer, the coefficient
of $v_0$ will always be nonzero, and from this it follows easily that $U_\mu = \{0\}$. In
that case, $V_\mu/U_\mu = V_\mu$ will be infinite dimensional.

On the other hand, if $\mu = m$, where $m$ is a non-negative integer, then the
coefficient of $v_0$ in (7.18) will be zero for all $k > m$. In this case, $U_\mu = U_m$
will consist of all linear combinations of the $v_k$'s with $k > m$. The quotient
space $V_m/U_m$ can then be identified with the span of $v_0, \ldots, v_m$ and is finite
dimensional.
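
As a small illustration of this dichotomy, the sketch below evaluates the coefficient $\prod_{l=1}^{k} l(\mu + 1 - l)$ from (7.18) and looks for the first $k$ at which it vanishes. The function names and the cutoff `search_limit` are assumptions made for the example; for $\mu = m$ the first vanishing index is $m + 1$, which matches the dimension of the span of $v_0, \ldots, v_m$.

```python
from fractions import Fraction

def coefficient_of_v0(mu, k):
    # Coefficient c in pi(X)^k v_k = c v_0, namely prod_{l=1}^{k} l (mu + 1 - l).
    c = 1
    for l in range(1, k + 1):
        c *= l * (mu + 1 - l)
    return c

def quotient_dimension(mu, search_limit=50):
    # Smallest k >= 1 for which the coefficient vanishes, or None if no such k
    # is found up to search_limit (the expected outcome when mu is not a
    # non-negative integer).
    for k in range(1, search_limit + 1):
        if coefficient_of_v0(mu, k) == 0:
            return k
    return None

print(quotient_dimension(3), quotient_dimension(Fraction(1, 2)))
```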

Note that the Weyl group for this problem is simply $\{I, -I\}$. The eigenvalues
$\lambda$ of $H$ occurring in $V_\mu$ are definitely not invariant under $\lambda \to -\lambda$.
Nevertheless, in the case $\mu = m$, the eigenvalues for $H$ occurring in $V_m/U_m$
are $m, m - 2, \ldots, -m$ and this set of eigenvalues is invariant under $\lambda \to -\lambda$.


7.4 Constructing the Representations II: The 
Peter—Weyl Theorem 


In this approach, we construct the representations as representations of the
simply-connected compact group $K$ whose complexified Lie algebra is $\mathfrak{g}$. We
make use of the Haar measure on $K$, used already in the proof of complete
reducibility. (See Section C.4.) The Haar measure is a finite measure on $K$ that
is invariant under the left and right actions of $K$. We normalize the measure
so that $\mu(K) = 1$. We then consider the Hilbert space $L^2(K, \mu)$ consisting of
measurable complex-valued functions $f$ on $K$ with the property that

$$\int_K |f(x)|^2 \, d\mu(x) < \infty.$$

The finite-dimensional irreducible representations of $K$ will ultimately be
realized as certain finite-dimensional subspaces of $L^2(K, \mu)$. The construction of
the representations is based on three main results: the Peter-Weyl theorem,
the Weyl character formula, and the Weyl integral formula. (See Section 7.6
for more information on some of these results.)


7.4.1 The Peter-Weyl theorem


In this subsection, it is not necessary that $K$ be a compact Lie group; any
compact topological group will do. We consider the Hilbert space $L^2(K, \mu)$,
the space of measurable functions on $K$ that are square-integrable with respect
to the Haar measure on $K$.

The Peter-Weyl theorem gives a decomposition of $L^2(K, \mu)$ into
finite-dimensional subspaces that are invariant under the left and right actions of
$K$. Specifically, if $\Sigma$ is a finite-dimensional irreducible representation of $K$
acting on a vector space $V$, then we consider the space of matrix entries of
$\Sigma$. Suppose we choose a basis $\{u_k\}$ for $V$. Then, for each $x \in K$, the linear
operator $\Sigma(x)$ can be expressed as a matrix with respect to this basis; we
denote the entries of this matrix as $\Sigma(x)_{kl}$. Then, a matrix entry for $\Sigma$ is a
function on $K$ that can be expressed in the form

$$f(x) = \sum_{k,l=1}^{\dim V} a_{kl}\, \Sigma(x)_{kl} \tag{7.19}$$

for some set of constants $a_{kl}$.
We can describe the space of matrix entries in a basis-independent way as
the space of functions that can be expressed in the form

$$f(x) = \mathrm{trace}(\Sigma(x) A) \tag{7.20}$$

for some linear operator $A$ on $V$. To see the equivalence of these two forms,
let $A_{kl}$ be the matrix for the operator $A$ in the basis $\{u_k\}$. Then, the matrix
for $\Sigma(x) A$ is given by the matrix product $(\Sigma(x) A)_{kl} = \sum_m \Sigma(x)_{km} A_{ml}$ and,
so

$$\mathrm{trace}(\Sigma(x) A) = \sum_{k,m=1}^{\dim V} \Sigma(x)_{km} A_{mk}.$$

Thus, every function of the form (7.20) can be expressed in form (7.19) with
$a_{kl} = A_{lk}$, and vice versa.
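
The index bookkeeping in this equivalence is easy to get backwards, so here is a minimal sketch that checks it on a concrete pair of matrices. The matrix `S` merely stands in for the value $\Sigma(x)$ at one fixed $x$; it is not required to come from an actual representation, and the helper names are illustrative.

```python
def matmul(A, B):
    # Product of two square matrices given as lists of rows.
    n = len(A)
    return [[sum(A[i][m] * B[m][j] for m in range(n)) for j in range(n)]
            for i in range(n)]

def trace(A):
    return sum(A[i][i] for i in range(len(A)))

def transpose(A):
    n = len(A)
    return [[A[j][i] for j in range(n)] for i in range(n)]

def matrix_entry_form(S, a):
    # The sum over k, l of a_{kl} S_{kl}, as in (7.19).
    n = len(S)
    return sum(a[k][l] * S[k][l] for k in range(n) for l in range(n))

S = [[1, 2], [3, 4]]   # stands in for Sigma(x) at a fixed x
A = [[5, 6], [7, 8]]   # an arbitrary operator A

# With a_{kl} = A_{lk} (the transpose of A), the two forms agree.
print(trace(matmul(S, A)), matrix_entry_form(S, transpose(A)))
```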

The significance of the space of matrix entries is that it is a
finite-dimensional space of functions on $K$ that is invariant under both left and
right translations by $K$. To understand what this means, suppose $f$ is a
matrix entry for a representation $\Sigma$, given, say, as in (7.20). Now, suppose we
define a new function $f_{y_1,y_2}$ by shifting $f$ on the left by $y_1$ and on the right
by $y_2$; that is, we set $f_{y_1,y_2}(x) = f(y_1 x y_2)$. Then, we have that

$$f_{y_1,y_2}(x) = \mathrm{trace}(\Sigma(y_1)\Sigma(x)\Sigma(y_2)A) = \mathrm{trace}(\Sigma(x)[\Sigma(y_2)A\Sigma(y_1)]).$$

Thus, $f_{y_1,y_2}$ is, again, a matrix entry for $\Sigma$, with $A$ replaced by $\Sigma(y_2)A\Sigma(y_1)$.

Of particular importance among the matrix entries is the character of
the representation $\Sigma$, denoted $\chi_\Sigma$, which is the function on $K$ given by

$$\chi_\Sigma(x) = \mathrm{trace}(\Sigma(x)).$$

This function is a matrix entry, obtained by taking $A = I$ in (7.20) or taking
$a_{kl} = \delta_{kl}$ in (7.19). The character is special because it satisfies

$$\chi_\Sigma(xyx^{-1}) = \mathrm{trace}(\Sigma(x)\Sigma(y)\Sigma(x)^{-1}) = \mathrm{trace}(\Sigma(y)) = \chi_\Sigma(y) \tag{7.21}$$

for all $x$ and $y$ in $K$. Any function $f$ on $K$ satisfying $f(xyx^{-1}) = f(y)$ for all
$x$ and $y$ in $K$ is called a class function. The reason for this terminology is
that the set of group elements of the form $xyx^{-1}$, with $y \in K$ fixed and $x$
ranging over $K$, is called the conjugacy class of $y$. A class function is then
a function that is constant on each conjugacy class.
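
Both the translation identity for matrix entries and the class-function property (7.21) rest on the cyclic invariance of the trace, and can be checked numerically. The sketch below assumes, purely for illustration, that $K = SU(2)$ and that $\Sigma$ is the defining two-dimensional representation, so that $\Sigma(x) = x$; the parametrization in `su2` and the sample angles are arbitrary choices.

```python
import cmath
import math

def su2(theta, phi, psi):
    # An element [[a, -conj(b)], [b, conj(a)]] of SU(2) with |a|^2 + |b|^2 = 1.
    a = cmath.exp(1j * phi) * math.cos(theta)
    b = cmath.exp(1j * psi) * math.sin(theta)
    return [[a, -b.conjugate()], [b, a.conjugate()]]

def matmul(A, B):
    return [[sum(A[i][m] * B[m][j] for m in range(2)) for j in range(2)]
            for i in range(2)]

def trace(A):
    return A[0][0] + A[1][1]

def inverse(U):
    # For a unitary matrix the inverse is the conjugate transpose.
    return [[U[j][i].conjugate() for j in range(2)] for i in range(2)]

def f(x, A):
    # Matrix entry f(x) = trace(Sigma(x) A) with Sigma(x) = x.
    return trace(matmul(x, A))

def character(x):
    return trace(x)

x, y1, y2 = su2(0.3, 1.1, -0.4), su2(1.2, 0.5, 2.0), su2(0.7, -1.3, 0.9)
A = [[1 + 2j, 0.5], [-1j, 3]]

# f(y1 x y2) should equal trace(x [y2 A y1]).
shifted = f(matmul(matmul(y1, x), y2), A)
rewritten = f(x, matmul(matmul(y2, A), y1))
# The character should be unchanged under conjugation.
conjugated = character(matmul(matmul(x, y1), inverse(x)))
print(abs(shifted - rewritten), abs(conjugated - character(y1)))
```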

It is easily verified that two equivalent representations have the same char- 
acter. The converse of this is much less obvious but also true: If two finite- 
dimensional representations of K have the same character, they are equivalent. 

The Peter-Weyl theorem gives a way of expressing any function
$f \in L^2(K, \mu)$ in a series expansion in terms of matrix entries of the irreducible
representations of $K$. We are interested primarily in the case in which $f$ is a
class function. In that case, the expansion involves only the characters. The
Peter-Weyl theorem, specialized to the case of class functions, is as follows.


Theorem 7.27 (Peter-Weyl). Let $L^2(K, \mu)^*$ denote the subspace of the
Hilbert space $L^2(K, \mu)$ consisting of square-integrable class functions. Then,
the functions $\chi_\Sigma$ form an orthonormal basis for $L^2(K, \mu)^*$, where $\Sigma$ ranges
over the equivalence classes of irreducible finite-dimensional representations of $K$.


Proving that the characters form an orthonormal set of functions in
$L^2(K, \mu)^*$ is a fairly elementary calculation using little more than Schur's
Lemma. (See Section II.4 of Bröcker and tom Dieck (1985).) Proving that the
characters form an orthonormal basis for $L^2(K, \mu)^*$ requires some analytical
argument. (See Section III.3 of Bröcker and tom Dieck (1985).)


7.4.2 The Weyl character formula 


We now assume that $K$ is a simply-connected compact Lie group. (There is
also a version of the result for connected compact Lie groups that are not
simply connected.) We choose, as usual, a maximal commutative subalgebra
$\mathfrak{t}$ of $\mathfrak{k}$ and we let $T$ be the connected Lie subgroup of $K$ whose Lie algebra is $\mathfrak{t}$.
It can be shown that $T$ is a closed subgroup of $K$ (called a "maximal torus").
It can further be shown that every element of $K$ is conjugate to an element
of $T$. This means that the values of a class function on $K$ are, in principle,
determined by its values on $T$. The Weyl character formula is a formula for
the restriction to $T$ of the character of an irreducible representation of $K$.

We let $\mathfrak{g}$ denote the complexification of the Lie algebra $\mathfrak{k}$ of $K$, so that $\mathfrak{g}$
is a complex semisimple Lie algebra. Then, $\mathfrak{h} := \mathfrak{t} + i\mathfrak{t}$ is a Cartan subalgebra
in $\mathfrak{g}$. We follow Notation 6.24 and regard the roots as elements of $\mathfrak{h}$ (not $\mathfrak{h}^*$).
Proposition 6.15 (expressed in terms of Notation 6.24) tells us that if $\alpha \in \mathfrak{h}$
is a root, then $\langle \alpha, H \rangle$ is imaginary for all $H$ in $\mathfrak{t}$, which means that $\alpha$ itself is
in $i\mathfrak{t}$. It is then convenient to introduce the real roots, which are simply $1/i$
times the ordinary roots. This means that a real root is a nonzero element $\alpha$
of $\mathfrak{t}$ with the property that there exists a nonzero $X$ in $\mathfrak{g}$ with

$$[H, X] = i\langle \alpha, H \rangle X$$

for all $H$ in $\mathfrak{t}$ (or, equivalently, for all $H$ in $\mathfrak{h}$). We can also introduce the real
co-roots as the elements of $\mathfrak{t}$ of the form $H_\alpha = 2\alpha/\langle \alpha, \alpha \rangle$, where $\alpha$ is a real
root.

In the same way, we will consider the real weights, which we think of as
elements of $\mathfrak{t}$ in the same way as for the roots. So, if $(\Sigma, V)$ is an irreducible
representation, then an element $\mu$ of $\mathfrak{t}$ is called a real weight for $\Sigma$ if there
exists a nonzero vector $v \in V$ such that

$$\sigma(H)v = i\langle \mu, H \rangle v$$

for all $H$ in $\mathfrak{t}$. (Here, $\sigma$ is the Lie algebra representation associated to the
group representation $\Sigma$.) An element $\mu$ of $\mathfrak{t}$ is said to be integral if $\langle \mu, H_\alpha \rangle$
is an integer for each real co-root $H_\alpha$. (All of the "real" objects are simply
$1/i$ times the corresponding objects without the qualifier "real.") The real
weights of any finite-dimensional representation of $\mathfrak{g}$ must be integral.

For the rest of this section, all roots and weights will be assumed real,
even if this is not explicitly stated.

If $\mu$ is an integral element, then it can be shown that there is a function
$f$ on $T$ satisfying

$$f(e^H) = e^{i\langle \mu, H \rangle} \tag{7.22}$$

for all $H \in \mathfrak{t}$. To understand this assertion, note that because $T$ is connected
and commutative, every element $t$ of $T$ can be expressed as $t = e^H$ (Exercise
25 from Chapter 2). However, a given $t$ can be expressed as $t = e^H$ in many
different ways; the content of the above assertion is that the right side of (7.22)
is independent of the choice of $H$ for a given $t$. This means that we want to
say that the right-hand side of (7.22) defines a function on $T$, not just on $\mathfrak{t}$.
We will discuss this point further in the next subsection.

Next, we introduce the element $\delta$ of $\mathfrak{t}$ defined to be half the sum of the
positive roots:

$$\delta = \frac{1}{2} \sum_{\alpha \in R^+} \alpha.$$

It can be shown that $\delta$ is an integral element. (Clearly, $2\delta$ is integral, but it is
not obvious that $\delta$ itself is integral.) Finally, if $w$ is any element of the Weyl
group, we think of $w$ as an orthogonal linear transformation of $\mathfrak{t}$, in which
case, $\det(w) = \pm 1$.
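
To see the integrality of $\delta$ in one concrete case, the sketch below uses the rank-two root system of type $A_2$ (the one attached to SU(3)), realized in the plane with unit-length roots. The coordinates chosen for the two simple roots are an assumption made for the example, not notation from the text; any other realization with the same angles would do.

```python
import math

def inner(u, v):
    return u[0] * v[0] + u[1] * v[1]

def coroot(alpha):
    # H_alpha = 2 alpha / <alpha, alpha>
    s = 2 / inner(alpha, alpha)
    return (s * alpha[0], s * alpha[1])

# Assumed coordinates: two simple roots of unit length at an angle of 120 degrees.
alpha1 = (1.0, 0.0)
alpha2 = (-0.5, math.sqrt(3) / 2)
positive_roots = [alpha1, alpha2, (alpha1[0] + alpha2[0], alpha1[1] + alpha2[1])]

# delta is half the sum of the positive roots.
delta = (0.5 * sum(r[0] for r in positive_roots),
         0.5 * sum(r[1] for r in positive_roots))

# Pair delta with each co-root; integrality means each value is an integer.
pairings = [inner(delta, coroot(r)) for r in positive_roots]
print(pairings)
```

In this realization $\delta$ pairs to 1 with the co-root of each simple root, up to floating-point rounding, which is the pattern one expects in general.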

We are now ready to state the Weyl character formula. 


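Theorem 7.28 (Weyl Character Formula). If $\Sigma$ is an irreducible
representation of $K$ with highest real weight $\mu$, then we have

$$\chi_\Sigma(e^H) = \frac{\sum_{w \in W} \det(w)\, e^{i\langle w \cdot (\mu + \delta), H \rangle}}{\sum_{w \in W} \det(w)\, e^{i\langle w \cdot \delta, H \rangle}} \tag{7.23}$$

for all $H$ in $\mathfrak{t}$ for which the denominator of the right-hand side of (7.23) is
nonzero. Here, $\delta$ denotes half the sum of the positive real roots.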


The set of points $H$ for which the denominator of the Weyl character
formula (the so-called Weyl denominator) is nonzero is dense in $\mathfrak{t}$. At points
where the denominator is zero, there is an apparent singularity in the formula
for $\chi_\Sigma$. However, at such points the numerator is also zero and the
character itself remains finite (as must be the case since, from the definition
of the character, it is well defined and finite at every point). Note that the
character formula gives a formula for the restriction of $\chi_\Sigma$ to $T$. Since $\chi_\Sigma$ is a
class function and since (as we have asserted but not proved) every element of
$K$ is conjugate to an element of $T$, knowing $\chi_\Sigma$ on $T$ determines, in principle,
$\chi_\Sigma$ on all of $K$.

A sketch of the proof of the Weyl character formula is given in Section 7.6. 


7.4.3 Constructing the representations 


Recall that our goal is to show that every dominant integral element $\mu$ actually
arises as the highest weight of some irreducible representation of $K$. To do
this, we consider an arbitrary dominant integral element $\mu$, which, at the
moment, we do not know to be the highest weight of any representation.
However, whether or not $\mu$ is the highest weight of some representation, it
can be shown that the right-hand side of (7.23) defines a function on $T$ that
is invariant under the action of the Weyl group. Then, there exists a unique
class function $f_\mu$ on $K$ whose restriction to $T$ is given by the right-hand side
of (7.23). Using something called the Weyl integral formula (see Section
7.6), it can be shown that the functions $f_\mu$, where $\mu$ ranges over all dominant
integral elements, are orthonormal. It is essential here that we can prove that
all of the $f_\mu$'s are orthonormal by direct computation, without appealing to
the Peter-Weyl theorem and without knowing that every $\mu$ is the highest
weight of a representation.

Let us take stock of the situation. We have the following results. First, the
Peter-Weyl theorem tells us that the characters for the (equivalence classes of)
irreducible representations form an orthonormal basis for the space of $L^2$ class
functions. Second, the Weyl character formula tells us that for an irreducible
representation having highest weight $\mu$, the character of the representation
is given by (7.23). Third, the Weyl integral formula tells us that if for every
dominant integral element $\mu$ we define $f_\mu$ to be the unique class function whose
restriction to $T$ is given by (7.23), then the $f_\mu$'s are orthonormal. This holds
even though we do not know at the moment that every dominant integral
element is the highest weight of a representation.

These results together imply that every dominant integral element is
actually the highest weight of some representation. To see this, note that the
Peter-Weyl theorem and the Weyl character formula tell us that the set of
$f_\mu$'s, where $\mu$ ranges over all the highest weights of representations, forms an
orthonormal basis for $L^2(K, \mu)^*$. On the other hand, the Weyl integral
formula says that the set of $f_\mu$'s, where $\mu$ ranges over all dominant integral
elements, forms an orthonormal set. This second orthonormal set contains
the first one, since the highest weight of an irreducible representation must
be dominant integral. However, an orthonormal basis cannot be contained in
a strictly larger orthonormal set; if it were, it would not be a basis. So, the
only possibility is that the set of highest weights of irreducible representations
is equal to the set of dominant integral elements, which is what we are trying
to prove.

To say the same thing a different way, suppose there were some dominant
integral element $\mu$ that was not the highest weight of any representation, and
consider the function $f_\mu$. Since (we are assuming) $\mu$ is not the highest weight
of a representation, the set of characters is a certain set of $f_\sigma$'s, where $\sigma$
ranges over some subset of the dominant integral elements not including $\mu$.
However, the Weyl integral formula tells us that $f_\mu$ is orthogonal to $f_\sigma$ for
all $\sigma \neq \mu$. This means that $f_\mu$ is a nonzero class function that is orthogonal
to all characters, since the characters are all $f_\sigma$'s with $\sigma \neq \mu$. This, however,
is impossible: The Peter-Weyl theorem implies that any class function that
is orthogonal to all the characters must be zero. So, $\mu$ must, after all, be the
highest weight of some representation.

To see this argument spelled out in greater detail, see Bröcker and tom
Dieck (1985) or Simon (1996). The argument in those books is slightly more
complicated than the one described here because those books consider
arbitrary connected compact groups, not necessarily simply connected.

This "construction" of the representations of $K$ is not very constructive;
that is, we have proved that a representation having each dominant integral
element as its highest weight exists, but we have not given a very explicit
description of the representation. If one looks at the proof of the Peter-Weyl
theorem, one will see that the representations are realized as certain
finite-dimensional, translation-invariant spaces of functions on $K$, but it is not
especially easy to see precisely which functions one gets. The Borel-Weil
construction, described in the next section, gives a more explicit realization of
the representations. Thus, the Borel-Weil construction is useful even if one has
already proved that every dominant integral element is the highest weight of
some representation.


7.4.4 Analytically integral versus algebraically integral elements


One step of the argument in the previous subsection deserves elaboration. We
asserted (see (7.22)) that if $K$ is simply connected and $\mu$ is a (real) integral
element, then there exists a function $f$ on $T$ satisfying

$$f(e^H) = e^{i\langle \mu, H \rangle} \tag{7.24}$$

for all $H$ in $\mathfrak{t}$. Let us think about what is entailed in this statement. Since
the Lie algebra $\mathfrak{t}$ of the connected group $T$ is commutative, $T$ itself must also
be commutative. It follows that the exponential map $\exp : \mathfrak{t} \to T$ is a
homomorphism. The image of this homomorphism contains a neighborhood of the
identity in $T$ (by the local surjectivity of the exponential mapping for
arbitrary Lie groups). Since, also, $T$ is connected, it follows that the exponential
mapping for $T$ is surjective.

Now, let $\Gamma \subset \mathfrak{t}$ be the kernel of the exponential mapping; that is,

$$\Gamma = \{ H \in \mathfrak{t} \mid e^H = I \}.$$

Since the exponential mapping for $T$ is surjective, every $t \in T$ can be written
as $t = e^H$ for some $H$ in $\mathfrak{t}$. If $H_1$ and $H_2$ are in $\mathfrak{t}$ and $e^{H_1} = e^{H_2}$, then (since $\mathfrak{t}$
is commutative) $e^{H_1 - H_2} = I$ and, so, $H_1 - H_2$ is in $\Gamma$. This means that every
$t \in T$ can be written as $t = e^H$, and the $H$ is unique up to adding on an
element of $\Gamma$. So, now suppose we try to define a function $f$ on $T$ by defining
(as in (7.24)) $f(t) = \exp i\langle \mu, H \rangle$, where $H$ is chosen so that $e^H = t$. When is
this well defined (i.e., independent of the choice of $H$)? Well, if $H$ is such that
$e^H = t$, then any $H'$ with $e^{H'} = t$ must be of the form $H' = H + \gamma$, with $\gamma \in \Gamma$.
Therefore, we need that $\exp i\langle \mu, H \rangle = \exp i\langle \mu, H' \rangle = \exp i\langle \mu, H \rangle \exp i\langle \mu, \gamma \rangle$.
This will hold precisely if $\langle \mu, \gamma \rangle$ is an integer multiple of $2\pi$. We conclude, then,
that the function in (7.24) is well defined precisely if $\mu$ has the property that
$\langle \mu, \gamma \rangle$ is an integer multiple of $2\pi$ for all $\gamma$ in the kernel of the exponential
mapping. An element $\mu$ of $\mathfrak{t}$ having this property is called an analytically
integral element.
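
The well-definedness condition can be seen numerically in the simplest case of a one-dimensional torus. In the sketch below, a torus element is labeled by a real parameter $a$, with $a$ and $a + 2\pi$ labeling the same element, so that the kernel of the exponential corresponds to the integer multiples of $2\pi$. Writing the pairing $\langle \mu, H \rangle$ as $n \cdot a$ for a real number $n$ is a normalization assumed for the example; the function names and sample points are likewise illustrative.

```python
import cmath
import math

def f_candidate(n, a):
    # The value exp(i n a) that (7.24) would assign to the torus element labeled by a.
    return cmath.exp(1j * n * a)

def is_well_defined(n, samples=(0.3, 1.7, 4.0), tol=1e-9):
    # a and a + 2*pi label the same torus element, so the candidate function
    # is well defined only if it takes the same value at both labels.
    return all(abs(f_candidate(n, a) - f_candidate(n, a + 2 * math.pi)) < tol
               for a in samples)

print(is_well_defined(3), is_well_defined(0.5))
```

With this normalization the candidate function descends to the torus exactly when $n$ is an integer, which is the one-dimensional shadow of the condition that $\langle \mu, \gamma \rangle$ be an integer multiple of $2\pi$.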

For reasons of consistency with Appendix E, it is convenient to introduce
the set

$$\Lambda = \{ H \in \mathfrak{t} \mid e^{2\pi H} = I \},$$

so that the elements $\lambda$ of $\Lambda$ are precisely those for which $2\pi\lambda \in \Gamma$ (i.e., those
$\lambda$ of the form $\lambda = \gamma/2\pi$ for some $\gamma \in \Gamma$). Saying that $\langle \mu, \gamma \rangle$ is an integer
multiple of $2\pi$ is the same as saying that $\langle \mu, \lambda \rangle$ is an integer. So, we have the
following definition.


Definition 7.29. An element $\mu$ of $\mathfrak{t}$ is called an analytically integral
element if $\langle \mu, \lambda \rangle$ is an integer for all $\lambda$ in $\Lambda$.


The following result summarizes the conclusion of the previous paragraphs.


Proposition 7.30. For $\mu \in \mathfrak{t}$, there exists a function $f$ on $T$ satisfying

$$f(e^H) = e^{i\langle \mu, H \rangle}$$

if and only if $\mu$ is an analytically integral element.


Meanwhile, we have another notion of integral elements, namely that $\mu \in \mathfrak{h}$
is an integral element if $\langle \mu, H_\alpha \rangle$ is an integer for each co-root $H_\alpha$. To
distinguish this condition from the condition for an analytically integral element,
we call $\mu$ an algebraically integral element if $\langle \mu, H_\alpha \rangle$ is an integer for all
co-roots. In the previous subsection, we asserted that (assuming $K$ is
simply connected) $f$ in (7.24) is well defined provided that $\mu$ is an algebraically
integral element. In light of Proposition 7.30, this amounts to asserting that
every algebraically integral element is analytically integral. In fact, we have
the following result.


Theorem 7.31. If K is simply connected, then the set of algebraically integral 
elements and the set of analytically integral elements are the same. 


This theorem is not at all obvious. It is not hard to show that every 
analytically integral element is algebraically integral; indeed, this is true even 
if K is not simply connected. Showing (in the simply-connected case) that 
every algebraically integral element is analytically integral is more involved. 
See Section E.4 for more information. 

Books such as Bröcker and tom Dieck (1985) and Simon (1996) are
concerned with the representations of compact Lie groups, which are not assumed
to be simply connected. These books do not address the issue of whether every
Lie algebra representation comes from a group representation. Thus, in those
books, the only relevant notion of integral element is that of an analytically
integral element and one never needs to worry about the relationship between
algebraically integral and analytically integral elements. The theorem of the
highest weight, as presented in those books, says that every dominant and
analytically integral element is the highest weight of a representation of $K$.
This result holds whether $K$ is simply connected or not.

We, on the other hand, wish to connect the compact-group approach to 
the Lie algebra approach and, for this, it is necessary to know Theorem 7.31. 


7.4.5 The SU(2) case 


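Let us see how the argument described in this section works out in the SU(2)
case. We work with the maximal commutative subalgebra $\mathfrak{t}$ of su(2) given by

$$\mathfrak{t} = \left\{ \begin{pmatrix} ia & 0 \\ 0 & -ia \end{pmatrix} \,\middle|\, a \in \mathbb{R} \right\}.$$

Note that $\mathfrak{t}$ is the set of matrices of the form $iaH$, where, as usual,

$$H = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}.$$

The associated maximal torus $T \subset SU(2)$ is then

$$T = \left\{ e^{iaH} = \begin{pmatrix} e^{ia} & 0 \\ 0 & e^{-ia} \end{pmatrix} \,\middle|\, a \in \mathbb{R} \right\}.$$

The Weyl group is the two-element group $\{I, -I\} \subset O(\mathfrak{t})$.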

Let us now compute the characters of the irreducible representations of
SU(2), or, more precisely, the restriction to $T$ of the characters. We know that
in each irreducible representation $\Sigma$, $\sigma(H)$ is diagonalizable with eigenvalues
$m, m - 2, \ldots, -m$. Thus, $\Sigma(\exp iaH) = \exp(ia\sigma(H))$ is also diagonalizable
with eigenvalues $\exp(ima)$, $\exp(i(m - 2)a)$, etc. The trace of $\Sigma(\exp iaH)$ is
then the sum of the eigenvalues:

$$\chi_m(e^{iaH}) = \mathrm{trace}(\Sigma(e^{iaH})) = \sum_{k=0}^{m} e^{i(m-2k)a}.$$


This is a finite geometric series, which we will sum using a slight variation of
the usual approach. We multiply $\chi_m$ by $\exp(ia)$ and then by $\exp(-ia)$, and
subtract the results. When we do this, all but two terms in the geometric
series cancel and we get

$$(e^{ia} - e^{-ia})\, \chi_m(e^{iaH}) = e^{i(m+1)a} - e^{-i(m+1)a} \tag{7.25}$$

so that

$$\chi_m(e^{iaH}) = \frac{e^{i(m+1)a} - e^{-i(m+1)a}}{e^{ia} - e^{-ia}} = \frac{\sin((m+1)a)}{\sin a}. \tag{7.26}$$

The first equality in (7.26) is nothing but the Weyl character formula for the
SU(2) case. Note that $\sin((m + 1)a)$ is zero at all points at which $\sin a$ is zero
(namely all integer multiples of $\pi$) and so the expression for $\chi_m$ is nonsingular,
even at points where the denominator is zero.
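
The closed form (7.26) is easy to compare against the sum of eigenvalues numerically. The following minimal sketch does so at a few sample values of $a$ away from the zeros of $\sin a$; the function names and sample points are illustrative.

```python
import cmath
import math

def character_sum(m, a):
    # Sum of the eigenvalues exp(i (m - 2k) a), k = 0, ..., m.
    return sum(cmath.exp(1j * (m - 2 * k) * a) for k in range(m + 1))

def character_quotient(m, a):
    # The closed form sin((m + 1) a) / sin(a), valid where sin(a) is nonzero.
    return math.sin((m + 1) * a) / math.sin(a)

largest_gap = max(abs(character_sum(m, a) - character_quotient(m, a))
                  for m in range(6) for a in (0.3, 1.1, 2.5))
print(largest_gap)
```

At $a = 0$ the quotient is undefined as written, but the sum is perfectly finite there: it consists of $m + 1$ terms each equal to 1, so the character takes the value $m + 1$, the dimension of the representation.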

Meanwhile, suppose that $f$ is any class function on SU(2) and let $dA$ denote
the normalized Haar measure on SU(2). The Weyl integral formula (Section
7.6) in this case states that

$$\int_{SU(2)} f(A)\, dA = \int_0^{2\pi} f(e^{iaH})\, 2\sin^2 a \,\frac{da}{2\pi}. \tag{7.27}$$

Here, $da/2\pi$ is the normalized Haar measure on $T = \{ e^{iaH} \mid a \in \mathbb{R} \}$.

We know from Chapter 4 that for each non-negative integer $m$, there is an
irreducible representation with highest weight $m$. Let us pretend that we do
not know this and see how the result follows from the Peter-Weyl theorem,
the Weyl character formula (7.26), and the Weyl integral formula (7.27). For
any non-negative integer $m$, whether or not we know that $m$ is the highest
weight of a representation, (7.26) gives a well-defined function on $T$ that is
invariant under the map $a \to -a$. It is not hard to show, then, that there
exists a unique class function $f_m$ on SU(2) whose restriction to $T$ is given by
(7.26). According to (7.27), we have for any distinct non-negative integers $m$
and $n$,

$$\int_{SU(2)} f_m(A)\, \overline{f_n(A)}\, dA = \int_0^{2\pi} \frac{\sin((m+1)a)}{\sin a}\, \frac{\sin((n+1)a)}{\sin a}\, 2\sin^2 a \,\frac{da}{2\pi} = 0, \tag{7.28}$$

because of the usual orthogonality of trigonometric functions on $[0, 2\pi]$.
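
The orthogonality relation can be checked by numerical quadrature. The sketch below evaluates the right-hand side of the Weyl integral formula for the product of two candidate characters, using the midpoint rule; the factors of $\sin a$ cancel against $2\sin^2 a$, leaving the integrand $2\sin((m+1)a)\sin((n+1)a)$. The function name and the number of quadrature steps are arbitrary choices for the example.

```python
import math

def weyl_inner_product(m, n, steps=2000):
    # Approximates (1 / (2 pi)) * integral over [0, 2 pi] of
    #   [sin((m+1)a) / sin a] [sin((n+1)a) / sin a] * 2 sin^2(a) da,
    # which simplifies to the integral of 2 sin((m+1)a) sin((n+1)a) / (2 pi).
    h = 2 * math.pi / steps
    total = 0.0
    for j in range(steps):
        a = (j + 0.5) * h  # midpoint of the j-th subinterval
        total += 2 * math.sin((m + 1) * a) * math.sin((n + 1) * a)
    return total * h / (2 * math.pi)

print(weyl_inner_product(2, 5), weyl_inner_product(3, 3))
```

The same computation with $m = n$ gives a value close to 1, consistent with the claim that the functions $f_m$ are not merely orthogonal but orthonormal.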

Suppose, now, that there were some $m$ that was not the highest weight of a
representation. Then, the set of characters for SU(2) would consist of $f_n$'s with
$n$ ranging over some subset of the non-negative integers not including $m$. This
would mean, by (7.28), that $f_m$ is a nonzero class function that is orthogonal
to all of the characters (since the characters are all $f_n$'s with $n \neq m$). However,
the Peter-Weyl theorem says that the characters form an orthonormal basis
for the set of $L^2$ class functions on SU(2), and thus a nonzero class function
cannot be orthogonal to all of the characters. This, then, is a contradiction
and $m$ must be, after all, the highest weight of a representation.


7.5 Constructing the Representations III: The 
Borel—Weil Construction 


The Borel—Weil construction, described in this section, is often described after 
the theorem of the highest weight has been proved (using, say, Verma modules 
to prove that every dominant integral element arises as the highest weight of an 
irreducible finite-dimensional representation). In such approaches, the Borel- 
Weil construction is simply an illuminating way to “realize” representations 
whose existence has already been demonstrated. However, it is also possible to 
use the Borel—Weil construction to prove the existence of the representations, 
and that is the approach we will follow here. 


7.5.1 The complex-group approach 


We have already seen two approaches to constructing the representations: the
Lie algebra point of view, using Verma modules, and the compact-group point
of view, using the Peter-Weyl theorem and the Weyl character formula. We
now consider the complex-group point of view, using something called the
Borel-Weil construction. (There is also a variant of the Borel-Weil
construction that uses algebraic groups instead of complex groups.)

Before getting into the details of the Borel-Weil construction, we need to
establish some notation and at the same time make sure that we understand
the relationships between the representations of the various objects (Lie
algebra, compact group, complex group). Let $\mathfrak{g}$ be a complex semisimple Lie
algebra realized as a subalgebra of some $M_n(\mathbb{C})$. Let $\mathfrak{k}$ be a compact real form
of $\mathfrak{g}$ and let $K$ and $G$ be the connected Lie subgroups of GL(n;C) whose Lie
algebras are $\mathfrak{k}$ and $\mathfrak{g}$, respectively. It can be shown that both $K$ and $G$ are
closed subgroups of GL(n;C) and, hence, matrix Lie groups. Let us assume for
simplicity that $K$ and $G$ are both simply connected. (It turns out that $G$ is
simply connected if and only if $K$ is; see the last section of Appendix E.) In
this case, the representations of $K$ (i.e., continuous homomorphisms of $K$ into
some GL(N;C)) are in one-to-one correspondence with the representations of
$\mathfrak{k}$, which, in turn, are in one-to-one correspondence with the complex-linear
representations of $\mathfrak{g} = \mathfrak{k}_{\mathbb{C}}$.

Now, since we assume that $G$ is simply connected, every complex-linear
representation of $\mathfrak{g}$ can be exponentiated to give a representation of $G$.
However, not every representation of $G$ arises in this way. After all, if $\Pi$ is a
representation of $G$ (continuous homomorphism of $G$ into some GL(N;C)),
there is no reason that the associated Lie algebra representation $\pi$ should be
complex-linear. For example, consider the representation of SL(2;C) given by
entrywise complex conjugation:

$$\Pi \begin{pmatrix} a & b \\ c & d \end{pmatrix} = \begin{pmatrix} \bar{a} & \bar{b} \\ \bar{c} & \bar{d} \end{pmatrix}.$$

(This is a representation (i.e., a continuous homomorphism of SL(2;C) into
GL(2;C)) because the complex conjugate of the product of two matrices is the
same as the product of the complex conjugates.) The associated representation
$\pi$ of sl(2;C) is given by the same formula as $\Pi$, since the complex conjugate
of $\exp(tX)$ is $\exp(t\bar{X})$, where $\bar{X}$ is the complex conjugate (entrywise) of $X$.
Clearly, then, $\pi$ is not complex-linear but rather conjugate-linear.
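
Both halves of this example can be checked directly on small matrices. The sketch below is only an illustration: the two elements of SL(2;C) and the trace-zero matrix $X$ are arbitrary choices, and the helper names are not notation from the text. It checks that entrywise conjugation is multiplicative, and that the Lie algebra map satisfies $\pi(iX) = -i\pi(X)$ rather than $\pi(iX) = i\pi(X)$.

```python
def conj(A):
    # Entrywise complex conjugate; this is Pi on the group and pi on the Lie algebra.
    return [[complex(z).conjugate() for z in row] for row in A]

def matmul(A, B):
    return [[sum(A[i][m] * B[m][j] for m in range(2)) for j in range(2)]
            for i in range(2)]

def scale(c, A):
    return [[c * z for z in row] for row in A]

def close(A, B, tol=1e-12):
    return all(abs(A[i][j] - B[i][j]) < tol for i in range(2) for j in range(2))

# Two elements of SL(2;C) (each has determinant 1).
A = [[1, 1j], [0, 1]]
B = [[2j, 0], [1, -0.5j]]
# An element of sl(2;C) (trace zero).
X = [[1j, 2], [3 - 1j, -1j]]

multiplicative = close(conj(matmul(A, B)), matmul(conj(A), conj(B)))
conjugate_linear = close(conj(scale(1j, X)), scale(-1j, conj(X)))
complex_linear = close(conj(scale(1j, X)), scale(1j, conj(X)))
print(multiplicative, conjugate_linear, complex_linear)
```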

We call a representation of $G$ holomorphic if the associated
representation of $\mathfrak{g}$ is complex-linear. Since $\mathfrak{g}$ is a complex subalgebra of $M_n(\mathbb{C})$, the
group $G$ is automatically a complex submanifold of GL(n;C) (Appendix C),
and it can be shown that if $\Pi$ is holomorphic in the sense of the previous
sentence, then $\Pi$ is a holomorphic mapping of the complex manifold $G$ into
GL(N;C).

Assuming still that K and G are simply connected, we conclude that the 
following objects are in one-to-one correspondence with each other: 


continuous representations of $K$
real-linear representations of $\mathfrak{k}$
complex-linear representations of $\mathfrak{g}$
holomorphic representations of $G$.


Here, "continuous" in the representations of $K$ is to emphasize that we allow
any continuous homomorphism of $K$ into GL(N;C). There is one other class
of objects that is frequently studied: the "algebraic" representations of $G$. It
is not hard to see that every algebraic representation of $G$ is holomorphic;
it is less obvious but still true that every holomorphic representation of $G$
is also algebraic. So, actually, the algebraic representations of $G$ are also in
one-to-one correspondence with the above-listed objects.


7.5.2 The setup 


We continue to assume that $\mathfrak{g} \subset M_n(\mathbb{C})$ is a complex semisimple Lie algebra
with compact real form $\mathfrak{k}$ and that $G$ and $K$ are the connected subgroups
of GL(n;C) with Lie algebras $\mathfrak{g}$ and $\mathfrak{k}$, respectively. We will assume that $K$
is contained in U(n). (This is a harmless assumption since, by the averaging
method of Section 4.10, there is an inner product on $\mathbb{C}^n$ that is invariant
under the action of $K$, and we can make a change of basis that converts this
inner product into the usual one on $\mathbb{C}^n$.) Having made this assumption, we
will have that each $X \in \mathfrak{k}$ will satisfy $X^* = -X$, where $X^*$ is the usual matrix
adjoint of $X$. It then follows that for any $Z = X_1 + iX_2 \in \mathfrak{g}$, we have that
$Z^* = -X_1 + iX_2$ also belongs to $\mathfrak{g}$. From this it is not hard to show that for
any $A$ in $G$, $A^*$ is also in $G$.

Now, suppose that $\pi$ is a representation of $\mathfrak{g}$ acting on some space $V$, and
$\Pi$ is the associated representation of $G$. We can choose an inner product on
$V$ so that $\Pi(x)$ is unitary for all $x$ in $K$. In that case, it is not hard to show
that $\Pi$ satisfies

$$\Pi(A)^* = \Pi(A^*)$$

for all $A \in G$. For $A \in K$, this means simply that
$\Pi(A^*) = \Pi(A^{-1}) = \Pi(A)^{-1} = \Pi(A)^*$, since both $A$ and $\Pi(A)$ are unitary.

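We now choose a maximal commutative subalgebra $\mathfrak{t}$ in $\mathfrak{k}$ and we let $\mathfrak{h} =
\mathfrak{t} + i\mathfrak{t}$ be the associated Cartan subalgebra of $\mathfrak{g}$. We let $R$ denote the set of
roots for $\mathfrak{g}$ relative to $\mathfrak{h}$, we choose a base $\Delta$ for $R$, and we let $R^+$ denote the
set of positive roots with respect to $\Delta$. Now, consider the following subspaces
of $\mathfrak{g}$:

\[
\mathfrak{n}^+ = \bigoplus_{\alpha \in R^+} \mathfrak{g}_\alpha, \qquad
\mathfrak{n}^- = \bigoplus_{\alpha \in R^-} \mathfrak{g}_\alpha, \qquad
\mathfrak{b}^+ = \mathfrak{h} \oplus \mathfrak{n}^+, \qquad
\mathfrak{b}^- = \mathfrak{h} \oplus \mathfrak{n}^-.
\]

It is easily seen that each of these spaces is actually a subalgebra of $\mathfrak{g}$. It
is also easily seen that $\mathfrak{n}^+$ is an ideal in $\mathfrak{b}^+$ and $\mathfrak{n}^-$ is an ideal in $\mathfrak{b}^-$ (i.e.,
$[\mathfrak{b}^+, \mathfrak{n}^+] \subset \mathfrak{n}^+$ and $[\mathfrak{b}^-, \mathfrak{n}^-] \subset \mathfrak{n}^-$). Neither $\mathfrak{n}^+$ nor $\mathfrak{n}^-$ is an ideal in $\mathfrak{g}$. From
the proof of Proposition 6.19 we see that $(\mathfrak{g}_\alpha)^* = \mathfrak{g}_{-\alpha}$ and, thus, $(\mathfrak{b}^-)^* = \mathfrak{b}^+$
and $(\mathfrak{b}^+)^* = \mathfrak{b}^-$.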
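These structural claims can be checked by hand in a concrete case. The following is a minimal sketch, assuming the standard choices for sl(3;C) (the Cartan subalgebra is the diagonal traceless matrices, so that b+ is the upper triangular and n+ the strictly upper triangular traceless matrices). The helper names are my own and purely illustrative.

```python
N = 3  # work in sl(3; C); this choice is an assumption made for illustration

def E(i, j):
    """Elementary N x N matrix with a 1 in position (i, j) and 0 elsewhere."""
    return [[1 if (r, c) == (i, j) else 0 for c in range(N)] for r in range(N)]

def sub(a, b):
    return [[a[r][c] - b[r][c] for c in range(N)] for r in range(N)]

def mul(a, b):
    return [[sum(a[r][k] * b[k][c] for k in range(N)) for c in range(N)]
            for r in range(N)]

def bracket(a, b):
    """Commutator [a, b] = ab - ba."""
    return sub(mul(a, b), mul(b, a))

def adjoint(a):
    """Conjugate transpose."""
    return [[a[c][r].conjugate() for c in range(N)] for r in range(N)]

def is_upper(a, strict=False):
    """True if a is upper triangular (strictly so when strict=True)."""
    return all(a[r][c] == 0 for r in range(N) for c in range(N)
               if (r >= c if strict else r > c))

h_basis = [sub(E(i, i), E(i + 1, i + 1)) for i in range(N - 1)]
n_plus = [E(i, j) for i in range(N) for j in range(N) if i < j]
n_minus = [E(j, i) for i in range(N) for j in range(N) if i < j]
b_plus, b_minus = h_basis + n_plus, h_basis + n_minus

# [b+, n+] lands in n+ (the strictly upper triangular matrices).
ideal_in_b_plus = all(is_upper(bracket(b, n), strict=True)
                      for b in b_plus for n in n_plus)
# The adjoint maps b- into b+.
adjoint_swaps = all(is_upper(adjoint(b)) for b in b_minus)
# n+ is not an ideal in g: [E_21, E_12] is diagonal, not strictly upper.
not_ideal_in_g = not is_upper(bracket(E(1, 0), E(0, 1)), strict=True)
```

Checking the brackets on basis elements suffices, since the bracket is bilinear.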

We now let $B^+$, $B^-$, $N^+$, and $N^-$ be the connected Lie subgroups of $G$
corresponding to $\mathfrak{b}^+$, $\mathfrak{b}^-$, $\mathfrak{n}^+$, and $\mathfrak{n}^-$, respectively. These are always closed
subgroups of $G$ and hence matrix Lie groups. We have

\[ (B^+)^* = B^-, \qquad (B^-)^* = B^+. \]

(This follows from the corresponding result for the Lie algebras $\mathfrak{b}^+$ and $\mathfrak{b}^-$.)
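Now, suppose that $\mu$ is any element of $\mathfrak{h}$ and consider the linear map
$\chi_\mu : \mathfrak{b}^+ \to \mathbb{C}$ given by

\[ \chi_\mu(H + X) = \langle \mu, H \rangle, \qquad H \in \mathfrak{h},\ X \in \mathfrak{n}^+. \tag{7.29} \]

It is easily checked (using that $\mathfrak{n}^+$ is an ideal in $\mathfrak{b}^+$) that $\chi_\mu$ is a Lie algebra
homomorphism of $\mathfrak{b}^+$ into the one-dimensional commutative Lie algebra $\mathbb{C}$.
Now, $\mathfrak{b}^+$ is the Lie algebra of the group $B^+$, and we may think of $\mathbb{C}$ as the Lie
algebra of the group $\mathbb{C}^*$ of nonzero complex numbers, with the exponential
mapping from $\mathbb{C}$ to $\mathbb{C}^*$ being the usual exponential function. We may then
ask whether or not there is an associated Lie group homomorphism

\[ X_\mu : B^+ \to \mathbb{C}^*, \tag{7.30} \]

that is, a homomorphism satisfying $X_\mu(e^X) = e^{\chi_\mu(X)}$ for all $X \in \mathfrak{b}^+$.
Note that even though we are assuming that $G$ is simply connected, $B^+ \subset G$
is not necessarily simply connected. Indeed, it turns out that $B^+$ is never
simply connected.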


Proposition 7.32. Let $\chi_\mu : \mathfrak{b}^+ \to \mathbb{C}$ be the Lie algebra homomorphism
defined by (7.29). If $G$ is simply connected, then an associated group
homomorphism $X_\mu : B^+ \to \mathbb{C}^*$ exists if and only if $\mu \in \mathfrak{h}$ is an integral element.


The motivation for the definition of $\chi_\mu$ is that if $u_0$ is a highest weight
vector for some representation $\pi$ with highest weight $\mu$, then

\[ \pi(H + X)u_0 = \langle \mu, H \rangle u_0 \]

for all $H \in \mathfrak{h}$ and $X \in \mathfrak{n}^+$. (Note that $\pi(X)u_0$ must be zero for $X$ in $\mathfrak{n}^+$
since otherwise $\pi(X)u_0$ would be a linear combination of weight vectors with
weight higher than $\mu$.)

To make Proposition 7.32 plausible, suppose that $\mu$ is the highest weight
of some representation $\pi$ of $\mathfrak{g}$ (in which case, $\mu$ is certainly integral) and that
$u_0$ is a weight vector with weight $\mu$. If $G$ is simply connected, then there will
be an associated representation $\Pi$ of $G$. Then, for all $X \in \mathfrak{b}^+$, we will have

\[ \Pi(e^X) u_0 = e^{\pi(X)} u_0 = e^{\chi_\mu(X)} u_0. \tag{7.31} \]

Now, any $a \in B^+$ will be a finite product of elements of the form $e^X$, $X \in \mathfrak{b}^+$.
It follows that for any $a \in B^+$, we will have $\Pi(a)u_0 = f(a)u_0$ for some nonzero
complex number $f(a)$. The map $a \mapsto f(a)$ will be a homomorphism of $B^+$ into
$\mathbb{C}^*$, and by (7.31), the associated Lie algebra map is $\chi_\mu$, so, in fact, $f = X_\mu$.
We conclude, then, that if $\mu$ is the highest weight of a representation $\pi$ of
$\mathfrak{g}$ and $G$ is simply connected, then $X_\mu$ will exist. As a side benefit, we have
shown that, in this case, the associated representation $\Pi$ of $G$ satisfies

\[ \Pi(a)u_0 = X_\mu(a)u_0 \tag{7.32} \]

for all $a \in B^+$.

The argument in the preceding paragraph proves Proposition 7.32 in the
case where $\mu$ is the highest weight of some representation, in which case,
$\mu$ is certainly integral. However, of course, only dominant integral elements
arise as highest weights and the proposition claims that $X_\mu$ exists for any
integral $\mu$, dominant or not. Furthermore, even in the dominant integral case,
we are going to use the proposition in proving that every dominant integral
element arises as the highest weight of some representation, in which case,
we are not allowed to assume the existence of the representations in proving
the proposition. Nevertheless, it is the dominant case that we are mainly
interested in, and the reason that we are interested in the homomorphism $X_\mu$
is because of (7.32).

Let me now briefly sketch the proof of Proposition 7.32. It follows from
the "polar decomposition" for $G$ (see Section E.5) that $G$ is simply connected
if and only if $K$ is simply connected. Consider the restriction of $\chi_\mu$ to the
maximal commutative subalgebra $\mathfrak{t}$ of $\mathfrak{k}$, which is the linear map $H \mapsto \langle \mu, H \rangle$.
Let $T$ be the connected subgroup of $K$ whose Lie algebra is $\mathfrak{t}$. Since $\mathfrak{t} \subset \mathfrak{b}^+$, it
follows that $T \subset B^+$. If $G$ is simply connected (and so also $K$), then
Corollary E.8 implies that there exists a homomorphism $\Lambda_\mu : T \to \mathbb{C}^*$ such that
$\Lambda_\mu(\exp H) = \exp\langle \mu, H \rangle$ if and only if $\mu$ is an integral element. Now, if $X_\mu$ exists,
then, certainly, $\Lambda_\mu$ must exist, since in that case, $\Lambda_\mu$ is simply the restriction
of $X_\mu$ to $T \subset B^+$. Thus, if $X_\mu$ exists, then $\Lambda_\mu$ exists and, so (by Corollary
E.8), $\mu$ must be integral. To go in the other direction, recall that $B^+$ is never
simply connected, even when $G$ and $K$ are. (This is why not all of the
homomorphisms of $\mathfrak{b}^+$ into $\mathbb{C}$ give rise to homomorphisms of $B^+$ into $\mathbb{C}^*$.) However,
one can show that the fundamental group of $B^+$ is isomorphic to that of $T$ in
a natural way. This implies that all of the difficulty in passing from $\mathfrak{b}^+$ to $B^+$
is already present in passing from $\mathfrak{t}$ to $T$. So, if $\mu$ is integral, Corollary E.8
tells us that we can pass $\chi_\mu$ from $\mathfrak{t}$ to $T$ and, therefore, also from $\mathfrak{b}^+$ to $B^+$.
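The role of integrality can be seen numerically in the simplest case. The sketch below assumes K = SU(2) (a choice made only for illustration), where T consists of the matrices exp(iθH) = diag(e^{iθ}, e^{-iθ}) with H = diag(1, -1), and where μ is identified with the number m = ⟨μ, H⟩. Since exp(2πiH) = I, the formula Λ_μ(exp(iθH)) = e^{imθ} can be well defined on T only if e^{2πim} = 1. The function names are hypothetical.

```python
import cmath
import math

def candidate_character(m, theta):
    """Value the character would have to take at exp(i*theta*H), H = diag(1, -1)."""
    return cmath.exp(1j * m * theta)

def is_well_defined_on_T(m, tol=1e-9):
    """exp(2*pi*i*H) = I, so the candidate must return 1 at theta = 2*pi."""
    return abs(candidate_character(m, 2 * math.pi) - 1) < tol

# Integer values of m pass the consistency check.
integral_ok = all(is_well_defined_on_T(m) for m in (-3, -1, 0, 2, 5))
# A half-integer value fails: the candidate would send the identity to -1.
half_integer_fails = not is_well_defined_on_T(0.5)
```

This only tests a necessary condition at one group element; it is a plausibility check, not a substitute for Corollary E.8.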


7.5.3 The strategy 


The idea of the Borel-Weil construction is to realize each representation as
the action of $G$ on a certain space of functions on $G$ itself. In order to
understand the strategy, let us first assume that we have a representation $\Pi$ of $G$
and see what sort of functions on $G$ we get from this. Then, the Borel-Weil
construction will reverse this procedure: We will first construct a certain space
of functions on $G$ and then build the representation $\Pi$ from these functions.

Thus, let $(\pi, V)$ be a complex-linear representation of $\mathfrak{g}$ and let $\Pi$ be the
associated holomorphic representation of $G$. We continue to assume (as in
the previous subsection) that $\Pi$ satisfies $\Pi(g^*) = \Pi(g)^*$ for all $g$ in $G$. Then,
consider the matrix entries of $\Pi$. These are the functions on $G$ of the form

\[ F_{u,v}(g) = \langle u, \Pi(g)v \rangle, \tag{7.33} \]

where $u$ and $v$ are elements of $V$. Because $\Pi$ is a holomorphic representation
of $G$, these functions are holomorphic functions on $G$. For a fixed $u \in V$, let

\[ \mathcal{F}^u = \text{the space of all functions of the form } F_{u,v},\ v \in V. \tag{7.34} \]

For any fixed $u$, the map $v \mapsto F_{u,v}$ is a linear map, which (by the definition
of $\mathcal{F}^u$) sends $V$ onto $\mathcal{F}^u$. If $\Pi$ is irreducible and $u \neq 0$, then it is not hard
to show (Exercise 4) that the map $v \mapsto F_{u,v}$ is injective. So, for $u \neq 0$, $\mathcal{F}^u$
is a finite-dimensional vector space that is naturally isomorphic (as a vector
space) to $V$ itself, by the map $v \mapsto F_{u,v}$.

Now, if $F$ is any function on $G$ and $h$ is an element of $G$, define a new
function $R_h F$ by the formula

\[ (R_h F)(g) = F(gh). \]

The operator $R_h$ is the "right-translation by $h$" operator and it is a linear
operator on the space of all functions on $G$. Furthermore, we compute that

\[ (R_{h_1} R_{h_2} F)(g) = (R_{h_2} F)(g h_1) = F(g h_1 h_2) = (R_{h_1 h_2} F)(g). \]

So, the map $h \mapsto R_h$ is a homomorphism and we may think of $R$ as a
representation of $G$, acting on the (infinite-dimensional) space of all functions on
$G$. If we apply $R_h$ to a matrix entry, then we have

\[ R_h F_{u,v}(g) = \langle u, \Pi(gh)v \rangle = \langle u, \Pi(g)\Pi(h)v \rangle = F_{u,\Pi(h)v}(g). \tag{7.35} \]

So, evidently, the right action of $G$ leaves $\mathcal{F}^u$ invariant and, thus, $\mathcal{F}^u$ is a
finite-dimensional representation of $G$. Furthermore, if $u$ is nonzero and $\Pi$ is
irreducible, then $\mathcal{F}^u$ is isomorphic as a representation to $V$, since the map
$v \mapsto F_{u,v}$ is an intertwining map, by (7.35).
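The composition rules for translation operators are easy to get backwards, so a small experiment may help. The following sketch models functions on a matrix group as Python callables on 2 x 2 integer matrices and checks the rule for right translations, together with the corresponding (order-reversing) rule for left translations F(g) ↦ F(hg). The test function and the matrices are arbitrary choices of mine.

```python
def mul(a, b):
    """Product of 2 x 2 matrices stored as nested tuples."""
    (a11, a12), (a21, a22) = a
    (b11, b12), (b21, b22) = b
    return ((a11 * b11 + a12 * b21, a11 * b12 + a12 * b22),
            (a21 * b11 + a22 * b21, a21 * b12 + a22 * b22))

def R(h):
    """Right translation: (R_h F)(g) = F(gh)."""
    return lambda F: (lambda g: F(mul(g, h)))

def L(h):
    """Left translation: (L_h F)(g) = F(hg)."""
    return lambda F: (lambda g: F(mul(h, g)))

# An arbitrary polynomial function of the matrix entries.
F = lambda g: g[0][0] ** 2 + 3 * g[0][1] * g[1][0] - g[1][1]

h1 = ((1, 2), (0, 1))
h2 = ((1, 0), (3, 1))
g = ((2, 1), (1, 1))

# R_{h1} R_{h2} = R_{h1 h2}: right translation is a homomorphism.
right_ok = R(h1)(R(h2)(F))(g) == R(mul(h1, h2))(F)(g)
# L_{h1 h2} = L_{h2} L_{h1}: left translation reverses the order.
left_ok = L(h2)(L(h1)(F))(g) == L(mul(h1, h2))(F)(g)
# The unreversed order fails for these noncommuting h1, h2.
left_naive = L(h1)(L(h2)(F))(g) == L(mul(h1, h2))(F)(g)
```

Agreement at one sample point is of course only evidence, not a proof; the general identities follow from associativity exactly as in the displayed computation.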

The conclusion is this: We can realize any irreducible holomorphic
representation of $G$ as a space of holomorphic functions on $G$ that is invariant
under the right action of $G$.

Let us look as well at the left action of $G$ on functions. Consider the
"left-translation by $h$" operator, defined as

\[ (L_h F)(g) = F(hg). \]

I leave it to the reader to check that $L_{h_1 h_2} = L_{h_2} L_{h_1}$. (For our purposes, it
is not necessary that the map $h \mapsto L_h$ be a homomorphism. If, however, one
wants a homomorphism, then one merely needs to replace $F(hg)$ by $F(h^{-1}g)$
in the definition of $L_h$.) We compute that

\[ L_h F_{u,v}(g) = \langle u, \Pi(hg)v \rangle = \langle u, \Pi(h)\Pi(g)v \rangle
   = \langle \Pi(h)^* u, \Pi(g)v \rangle = F_{u',v}(g), \]

where $u' = \Pi(h)^* u$. Thus, for typical $h$, $L_h$ does not leave the space $\mathcal{F}^u$
invariant, since the value of $u$ is changed. Recall that we have defined $\Pi$ in
such a way that $\Pi(h)^* = \Pi(h^*)$, and so we can rewrite the above result as

\[ L_h F_{u,v}(g) = \langle \Pi(h^*)u, \Pi(g)v \rangle. \tag{7.36} \]


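Let us consider the case that $u = u_0$, where $u_0$ is a highest weight vector
with some highest weight $\mu$. Suppose that we take $h = b$, where $b$ is an element
of $B^-$, so that $b^*$ is in $B^+$. Recall the homomorphism $X_\mu : B^+ \to \mathbb{C}^*$ defined
in (7.30) and recall also that (by (7.32))

\[ \Pi(a)u_0 = X_\mu(a)u_0 \]

for all $a \in B^+$. Applying this with $a = b^*$ and substituting into (7.36) gives

\[ F_{u_0,v}(bg) = \overline{X_\mu(b^*)}\, F_{u_0,v}(g) \tag{7.37} \]

for all $b \in B^-$ and $g \in G$. (The complex conjugate appears because the inner
product is conjugate-linear in its first argument.)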

What do we conclude from all this? Given any finite-dimensional irreducible
holomorphic representation $\Pi$ of $G$, we can realize $\Pi$ as a space $\mathcal{F}^{u_0}$ of holomorphic
functions on $G$, where the space $\mathcal{F}^{u_0}$ is invariant under the right action of
$G$ and where each element $F$ of $\mathcal{F}^{u_0}$ transforms under the left action of $B^-$
according to (7.37). Now, it turns out that every holomorphic function on $G$
satisfying (7.37) is actually an element of $\mathcal{F}^{u_0}$. (This is far from obvious at
the moment.)

The idea, then, of the Borel-Weil construction is this. We start with an
integral element $\mu$ and we want to show that $\mu$ is actually the highest weight
of some representation $\Pi$. We construct the homomorphism $X_\mu : B^+ \to \mathbb{C}^*$
described in Proposition 7.32 and we define a space $\mathcal{F}_\mu$ as follows.


Definition 7.33. If $\mu$ is an integral element, let $X_\mu : B^+ \to \mathbb{C}^*$ be the
homomorphism given by Proposition 7.32. Then, we define $\mathcal{F}_\mu$ to be the space of
all holomorphic functions on $G$ satisfying

\[ F(bg) = \overline{X_\mu(b^*)}\, F(g) \]

for all $b \in B^-$ and all $g \in G$.


Recall that if $b$ is in $B^-$, then $b^*$ is in $B^+$. Although the definition of
$\mathcal{F}_\mu$ makes sense for any integral element, we are interested primarily in the
case in which $\mu$ is dominant integral. (It turns out that if $\mu$ is integral but
not dominant, then $\mathcal{F}_\mu$ contains only the zero function.) In the case that $\mu$
is dominant integral, we want to establish the following results. (1) $\mathcal{F}_\mu$ is
invariant under the right action of $G$. (2) $\mathcal{F}_\mu$ is finite dimensional. (3) $\mathcal{F}_\mu$
contains some nonzero elements. (4) $\mathcal{F}_\mu$, under the right action of $G$, is an
irreducible holomorphic representation such that the associated Lie algebra
representation has highest weight $\mu$. This will show that the dominant integral
element $\mu$ is indeed the highest weight of some irreducible representation and
will give a "concrete" realization of that representation.

If we can prove all of this and we let $u_0$ be a highest weight vector inside
$\mathcal{F}_\mu$, then we will have

\[ \mathcal{F}_\mu = \mathcal{F}^{u_0}, \]

where $\mathcal{F}^{u_0}$ is the space defined in (7.34). Note, however, that we are not
allowed to define $\mathcal{F}_\mu$ to be equal to $\mathcal{F}^{u_0}$ since we do not know at the beginning
that there is any representation $\Pi$ with highest weight $\mu$. So, instead, we
define $\mathcal{F}_\mu$ to be the space of functions having the properties that we know the
elements of $\mathcal{F}^{u_0}$ should have.

The hardest part of the Borel-Weil construction is to prove that the space
$\mathcal{F}_\mu$ is nonzero (i.e., that there exists a nonzero holomorphic function $F$
satisfying $F(bg) = \overline{X_\mu(b^*)}F(g)$, $b \in B^-$, $g \in G$). Note that if we knew that $\mu$ was
the highest weight of some irreducible representation, then functions of the
form (7.33) with $u = u_0$ would be in $\mathcal{F}_\mu$ and, thus, $\mathcal{F}_\mu$ would be nonzero. However, we are
trying to use the Borel-Weil construction to prove that such a representation
exists, and so we are not allowed to assume this.

Since $G$ is a complex group and $B^-$ is a complex subgroup, the quotient
manifold $G/B^-$ is a complex manifold. (The group $B^-$ is not a normal
subgroup of $G$ and, therefore, the quotient $G/B^-$ is not a group but only a
manifold.) The space $\mathcal{F}_\mu$ should really be thought of as the space of
holomorphic sections of a certain holomorphic line bundle over $G/B^-$. This point of
view allows a large amount of differential geometric machinery to be brought
to bear, especially cohomology and the Riemann-Roch formula. For example,
it can be shown that $G/B^-$ is a compact complex manifold. (Specifically,
$G/B^-$ is identifiable with $K/T$, where $T$ is the connected subgroup of $K$ with
Lie algebra $\mathfrak{t}$.) This implies that the space $\mathcal{F}_\mu$ of holomorphic sections is
finite dimensional. This machinery is beyond the scope of this book and is not
necessary if all one wants is to prove the existence of a representation with
a given dominant integral element as its highest weight. For more information on the differential
geometric side of things see Pressley and Segal (1986), Duistermaat and Kolk
(2000), Knapp (1986), Knapp (1988), and the article by Eastwood and Sawon
in Bridson and Salamon (2002).


7.5.4 The construction 


We continue with the notation established in Subsection 7.5.2. We consider
an integral element $\mu \in \mathfrak{h}$, which, at the moment, we do not assume to be
dominant. We let $X_\mu : B^+ \to \mathbb{C}^*$ be the homomorphism given by Proposition
7.32 and we consider the space $\mathcal{F}_\mu$ of functions on $G$ defined in Definition
7.33. We begin with one important but easy result.


Proposition 7.34. For any integral element $\mu$, the space $\mathcal{F}_\mu$ is invariant
under the right action of $G$.


Proof. Suppose that $F$ is an element of $\mathcal{F}_\mu$ and $h$ is an element of $G$. Then,
for all $g \in G$ and $b \in B^-$, we have (by the associativity of the product on $G$)

\[ (R_h F)(bg) = F(bgh) = \overline{X_\mu(b^*)}\, F(gh) = \overline{X_\mu(b^*)}\, (R_h F)(g). \]

This shows that $R_h F$ is again in $\mathcal{F}_\mu$. □

We are now ready to state the main result of this section.


Theorem 7.35. Assume $G$ is simply connected. Let $\mu$ be an integral element,
let $X_\mu : B^+ \to \mathbb{C}^*$ be as in Proposition 7.32, and let $\mathcal{F}_\mu$ be as in Definition
7.33. Then, the following results hold:


1. If $\mu$ is not dominant, then $\mathcal{F}_\mu$ contains only the zero function.

2. If $\mu$ is dominant integral, then $\mathcal{F}_\mu$ is nonzero but finite dimensional. In
this case $\mathcal{F}_\mu$ forms an irreducible holomorphic representation under the
right action of $G$, and this representation has highest weight $\mu$.


This construction of the representations of $G$ is called the Borel-Weil
construction.

Let us elaborate slightly on the meaning of Point 2 of the theorem. The
space $\mathcal{F}_\mu$ is invariant under $R_h$ ($h \in G$) and is finite dimensional. It can
be shown that the map $h \mapsto R_h|_{\mathcal{F}_\mu}$ is a continuous map of $G$ into $GL(\mathcal{F}_\mu)$
and, thus, $\mathcal{F}_\mu$ constitutes a finite-dimensional representation of $G$. Point 2
of the theorem asserts that this representation is holomorphic, meaning that
the associated representation of $\mathfrak{g}$ is complex-linear. The statement that "this
representation has highest weight $\mu$" then means more precisely that the
associated complex-linear representation of $\mathfrak{g}$ has highest weight $\mu$.

It is not feasible for me to give a complete proof of this theorem here. I
outline the intermediate steps needed and prove the ones that can be proved
easily. The main omission is that I do not prove that $\mathcal{F}_\mu$ is nonzero in the
dominant integral case. (Recall that if one already knows that a dominant
integral element $\mu$ is the highest weight of a finite-dimensional representation,
then it is easy to show, as in the previous subsection, that $\mathcal{F}_\mu$ is nonzero.
However, we are trying to use the Borel-Weil construction to prove the
existence of such a representation; to do so we must prove directly that $\mathcal{F}_\mu$ is
nonzero.)


Lemma 7.36. If $\mathcal{F}_\mu$ is nonzero and finite dimensional, then $\mathcal{F}_\mu$ forms an
irreducible holomorphic representation under the right action of $G$ and this
representation has highest weight $\mu$.


We know that in any finite-dimensional irreducible representation, the
highest weight must be dominant integral. The lemma thus implies that
if $\mathcal{F}_\mu$ is finite dimensional and nonzero, then $\mu$ must be dominant.


Proof. Assume that $\mathcal{F}_\mu$ is nonzero and finite dimensional. Then, $\mathcal{F}_\mu$ forms a
representation under the right action of $G$. Because the elements of $\mathcal{F}_\mu$ are
holomorphic and the right action of $G$ on itself is holomorphic, it can be
shown that $\mathcal{F}_\mu$ is a holomorphic representation of $G$ and, thus, there is an
associated complex-linear action of $\mathfrak{g}$ on $\mathcal{F}_\mu$. By complete reducibility of the
representations of $\mathfrak{g}$, $\mathcal{F}_\mu$ decomposes as a direct sum of irreducible $\mathfrak{g}$-invariant
subspaces. Each of these subspaces contains a nonzero highest weight vector
$F_\sigma$ with some highest weight $\sigma$. Applying (7.32) to the representation $\Pi$ given
by $\Pi(h) = R_h|_{\mathcal{F}_\mu}$, we obtain that

\[ R_a F_\sigma = X_\sigma(a) F_\sigma \]

for all $a \in B^+$. Meanwhile, since $F_\sigma$ is an element of $\mathcal{F}_\mu$, we have that

\[ L_b F_\sigma = \overline{X_\mu(b^*)}\, F_\sigma \]

for all $b \in B^-$. Thus, $F_\sigma$ satisfies

\[ F_\sigma(bga) = \overline{X_\mu(b^*)}\, X_\sigma(a)\, F_\sigma(g) \tag{7.38} \]

for all $g \in G$, $a \in B^+$, and $b \in B^-$.
If we take $g = I$ in (7.38), we obtain

\[ F_\sigma(ba) = \overline{X_\mu(b^*)}\, X_\sigma(a)\, F_\sigma(I). \tag{7.39} \]

Now, every element of $\mathfrak{g}$ can be written as the sum of an element of $\mathfrak{b}^+$ and
an element of $\mathfrak{b}^-$ (nonuniquely since $\mathfrak{b}^+ \cap \mathfrak{b}^- = \mathfrak{h}$). It follows (using the
Inverse Function Theorem as in the proof of Theorem 2.27) that there is a
neighborhood $U$ of $I$ in $G$ with the property that every element $g$ of $U$ can
be written (nonuniquely) as $g = ba$ with $a \in B^+$ and $b \in B^-$. This, together
with (7.39), tells us that if $F_\sigma(I)$ were equal to zero, then $F_\sigma$ would be zero
on $U$ and, hence (since $F_\sigma$ is holomorphic and $G$ is connected), that $F_\sigma$ would be identically zero.
Since we have chosen $F_\sigma$ to be nonzero, we conclude that $F_\sigma(I) \neq 0$. We may,
therefore, normalize $F_\sigma$ so that $F_\sigma(I) = 1$ and we obtain that

\[ F_\sigma(ba) = \overline{X_\mu(b^*)}\, X_\sigma(a). \tag{7.40} \]


Consider the connected subgroup $T$ of $K$ whose Lie algebra is $\mathfrak{t}$. Then, $T$
is contained in both $B^+$ and $B^-$. This means that each $t \in T$ may play the
role of either $b$ or $a$ in (7.40), so that

\[ F_\sigma(t) = \overline{X_\mu(t^*)} = X_\sigma(t). \tag{7.41} \]

However, since $T \subset K \subset U(n)$ we have that $t^* = t^{-1}$ for all $t \in T$.
Furthermore, we know that $\langle \mu, H \rangle$ is imaginary for all $H \in \mathfrak{t}$. This implies that $X_\mu(t)$
has absolute value 1 for all $t \in T$ and, thus, that $\overline{X_\mu(t)} = X_\mu(t)^{-1}$ for all
$t \in T$. So, we see that

\[ \overline{X_\mu(t^*)} = \overline{X_\mu(t^{-1})} = X_\mu(t) \]

for all $t \in T$. Thus, (7.41) becomes

\[ X_\mu(t) = X_\sigma(t) \]

for all $t \in T$. This can occur only if $\sigma = \mu$.

Thus, the only highest weight that can occur in $\mathcal{F}_\mu$ is $\mu$. Suppose now
that $\mathcal{F}_\mu$ is reducible, so that its decomposition into irreducibles contains at
least two terms, each of which (we have seen) must have highest weight $\mu$.
Let $F_\mu^1$ and $F_\mu^2$ be the highest weight vectors of these two subspaces. We
may normalize each of them to be equal to 1 at the identity, and then both
satisfy (7.40). This means that $F_\mu^1$ is equal to $F_\mu^2$ in a neighborhood of the
identity and, thus, everywhere since the functions are holomorphic. So, the
subspaces associated to $F_\mu^1$ and $F_\mu^2$ are equal after all and we conclude that
$\mathcal{F}_\mu$ is irreducible. □


Lemma 7.37. For all integral elements $\mu$, $\mathcal{F}_\mu$ is finite dimensional.


I will not attempt to prove this result here. The usual argument is to
regard $\mathcal{F}_\mu$ as a space of holomorphic sections of a complex line bundle over the
manifold $G/B^-$. Then, one shows that $G/B^-$ is a compact complex manifold
(by identifying $G/B^-$ with $K/T$) and one makes use of a standard result from
complex geometry that the space of holomorphic sections of a vector bundle
over a compact complex manifold is finite dimensional.

Once Lemma 7.37 is established, it remains only to show that in the
dominant integral case, the space $\mathcal{F}_\mu$ is nonzero. This is the most difficult step in
the argument. Our strategy is to build an element $F_\mu$ in $\mathcal{F}_\mu$ that will be our
highest weight vector. According to (7.40) (with $\sigma = \mu$), $F_\mu$ is determined
on elements $g$ of $G$ that are of the form $g = ba$, with $a \in B^+$ and $b \in B^-$.
Because $B^+$ and $B^-$ have a nontrivial intersection, it is convenient to look
at elements of the form $na$, with $a \in B^+$ and $n \in N^- \subset B^-$. Note that
$X_\mu$ is identically equal to 1 on $N^+$, because (by definition) $\chi_\mu$ is zero on $\mathfrak{n}^+$.
Moreover, for $n \in N^-$, we have that $n^* \in N^+$. Thus, $F_\mu$ should satisfy

\[ F_\mu(na) = X_\mu(a), \qquad a \in B^+,\ n \in N^-. \]

Note that as a vector space, $\mathfrak{g}$ decomposes as $\mathfrak{g} = \mathfrak{b}^+ \oplus \mathfrak{n}^-$. It can be shown
that $B^+$ and $N^-$ intersect only at the identity. It follows that each $g \in G$ can
be decomposed as $g = na$, with $a \in B^+$ and $n \in N^-$, in at most one way.
Thus, it makes sense to define a function on the set $N^- B^+ \subset G$ by specifying
its value on each pair $(n, a)$. Unfortunately, not every element of
$G$ is a product of an element of $N^-$ and an element of $B^+$; that is, $N^- B^+$
is a proper subset of $G$. So, we first define $F_\mu$ on $N^- B^+$ and then we must
show that $F_\mu$ extends to a holomorphic function on $G$.


Lemma 7.38. If $\mu$ is dominant integral, then the function $F_\mu$ on $N^- B^+ \subset G$
given by

\[ F_\mu(na) = X_\mu(a), \qquad a \in B^+,\ n \in N^-, \]

has a unique holomorphic extension to all of $G$. The resulting holomorphic
function on $G$ is an element of $\mathcal{F}_\mu$.


I will give only a rough outline of the proof, taken from Jantzen (1987).
(Since Jantzen works in the setting of algebraic group schemes, it is necessary
to translate some of the arguments into the language of complex Lie groups.)
For each root $\alpha$ let $A_\alpha$ be the element of $K \subset G$ given by (6.21) in Section
6.6. (For each $\alpha$, $A_\alpha$ is an element of $N(\mathfrak{t})$ that represents the Weyl group
element $w_\alpha$.) The calculations on pp. 201-202 of Jantzen (1987) then show
how to extend $F_\mu$ holomorphically to the set of elements of the form

\[ A_\alpha n a \tag{7.42} \]

with $a \in B^+$ and $n \in N^-$. One then has $F_\mu$ defined holomorphically on the
set $E$ consisting of $N^- B^+$ together with the elements of the form (7.42). One
then argues that the complement of $E$ in $G$ has complex codimension 2, and so
a standard result in complex analysis shows that $F_\mu$ extends holomorphically
to all of $G$.

Putting Lemmas 7.36, 7.37, and 7.38 together gives Theorem 7.35.


7.5.5 The SL(2;C) case 


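Let us see how the Borel-Weil construction works out in the case $G = SL(2;\mathbb{C})$.
In particular, we will show very explicitly in this case that the space $\mathcal{F}_\mu$ defined
in Definition 7.33 is nonzero whenever $\mu$ is dominant integral. If $X$, $Y$, and
$H$ denote the usual basis elements for $sl(2;\mathbb{C})$, then we may take $\mathfrak{h}$ to be the
span of the element $H$, $\mathfrak{b}^+$ to be the span of the elements $X$ and $H$, and $\mathfrak{n}^-$
to be the span of the element $Y$. In that case, the corresponding connected
subgroups of $SL(2;\mathbb{C})$ are

\[
B^+ = \left\{ \begin{pmatrix} \alpha & \beta \\ 0 & \alpha^{-1} \end{pmatrix} \,\middle|\, \alpha \in \mathbb{C}^*,\ \beta \in \mathbb{C} \right\}, \qquad
N^- = \left\{ \begin{pmatrix} 1 & 0 \\ \delta & 1 \end{pmatrix} \,\middle|\, \delta \in \mathbb{C} \right\}.
\]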

An integral element in this setting is simply an integer and a dominant integral
element is a non-negative integer. For any integer $m$, the homomorphism
$X_m : B^+ \to \mathbb{C}^*$ is given by

\[ X_m \begin{pmatrix} \alpha & \beta \\ 0 & \alpha^{-1} \end{pmatrix} = \alpha^m. \tag{7.43} \]


Let us now see which elements of $SL(2;\mathbb{C})$ are contained in $N^- B^+$. We
compute

\[
\begin{pmatrix} 1 & 0 \\ \delta & 1 \end{pmatrix}
\begin{pmatrix} \alpha & \beta \\ 0 & \alpha^{-1} \end{pmatrix}
= \begin{pmatrix} \alpha & \beta \\ \delta\alpha & \delta\beta + \alpha^{-1} \end{pmatrix}. \tag{7.44}
\]

Consider a matrix

\[ g = \begin{pmatrix} x & y \\ z & w \end{pmatrix} \tag{7.45} \]

with determinant one. If $x \neq 0$, then we can express $g$ uniquely in the form
of (7.44) if we take $\alpha = x$, $\beta = y$, and $\delta = z/x$. (It is easy to check that with
this choice of $\alpha$, $\beta$, and $\delta$, the (1,1), (1,2), and (2,1) entries in (7.44) agree
with the corresponding entries of $g$. The condition that $g$ have determinant
one then guarantees that the (2,2) entry in (7.44) also agrees with the (2,2)
entry in $g$.) If $x = 0$, then $g$ cannot be decomposed into the form (7.44), since
the (1,1) entry on the right-hand side in (7.44) cannot be zero. So, the set
$N^- B^+$ inside $SL(2;\mathbb{C})$ is precisely the set of matrices in $SL(2;\mathbb{C})$ whose (1,1)
entry is nonzero.

This calculation (specifically that $\alpha = x$) together with (7.43) shows us
that the function $F_m$ in Lemma 7.38 satisfies

\[ F_m(g) = x^m \]

on the set of matrices in $SL(2;\mathbb{C})$ with $x \neq 0$. When, then, does the function
$F_m$ extend to a holomorphic function on all of $SL(2;\mathbb{C})$? Clearly, it extends
precisely when $m \geq 0$. Is the resulting function $F_m$ ($m \geq 0$) an element of the
space $\mathcal{F}_m$ defined in Definition 7.33? Let us compute and see. The elements of
$B^-$ are those of the form

\[ b = \begin{pmatrix} a & 0 \\ c & a^{-1} \end{pmatrix}. \tag{7.46} \]

Now, if $b$ is as in (7.46) and $g$ as in (7.45), then the (1,1) entry in $bg$ is $ax$.
So, $F_m(g) = x^m$ and $F_m(bg) = a^m x^m$. Then, we compute that

\[ \overline{X_m(b^*)} = \overline{X_m \begin{pmatrix} \bar{a} & \bar{c} \\ 0 & \bar{a}^{-1} \end{pmatrix}} = a^m. \]

Thus, indeed, $F_m(bg) = \overline{X_m(b^*)}\, F_m(g)$ and $F_m$ is a (nonzero!) element of $\mathcal{F}_m$.
It is straightforward to extend this calculation to the case of $SL(n;\mathbb{C})$. See
Exercise 6.
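The SL(2;C) computation is concrete enough to test numerically. The sketch below, with helper names of my own choosing, decomposes a sample matrix with nonzero (1,1) entry as a product of an element of N- and an element of B+, and then checks the transformation law of Definition 7.33 for the function given by the m-th power of the (1,1) entry. The sample entries are arbitrary, and floating-point comparisons use a tolerance.

```python
def mul(p, q):
    """Product of 2 x 2 complex matrices stored as nested tuples."""
    (a, b), (c, d) = p
    (e, f), (g, h) = q
    return ((a * e + b * g, a * f + b * h), (c * e + d * g, c * f + d * h))

def adjoint(p):
    """Conjugate transpose."""
    (a, b), (c, d) = p
    return ((a.conjugate(), c.conjugate()), (b.conjugate(), d.conjugate()))

def X(m, a):
    """The character (7.43) on B+: the m-th power of the (1,1) entry."""
    return a[0][0] ** m

def F(m, g):
    """Candidate highest weight vector: the m-th power of the (1,1) entry."""
    return g[0][0] ** m

def decompose(g):
    """Write g = n a with n in N- and a in B+, or return None if x = 0."""
    (x, y), (z, w) = g
    if x == 0:
        return None
    return ((1, 0), (z / x, 1)), ((x, y), (0, 1 / x))

def close(u, v, tol=1e-9):
    return abs(u - v) < tol

x, y, z = 2 + 1j, 1j, 3 + 0j
g = ((x, y), (z, (1 + y * z) / x))   # last entry chosen so that det g = 1
a0, c0 = 1 - 2j, 0.5 + 1j
b = ((a0, 0j), (c0, 1 / a0))          # a sample element of B-

n, a = decompose(g)
na = mul(n, a)
decomposition_ok = all(close(na[r][c], g[r][c])
                       for r in range(2) for c in range(2))
# F_m(bg) equals the conjugate of X_m(b*) times F_m(g), for m = 0, ..., 4.
law_ok = all(close(F(m, mul(b, g)), X(m, adjoint(b)).conjugate() * F(m, g))
             for m in range(5))
# A matrix with vanishing (1,1) entry has no such decomposition.
no_decomposition = decompose(((0j, 1 + 0j), (-1 + 0j, 0j))) is None
```

Only non-negative m are tested, since for negative m the function fails to extend across the set where the (1,1) entry vanishes.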


7.6 Further Results 


Although the main result about the representations of a complex semisimple 
Lie algebra is the theorem of the highest weight, there are many other useful 
results about the representations. This section presents some of these results, 
mostly without proofs. We continue to assume the notation established at the 
beginning of this chapter. 


7.6.1 Duality 


Recall from Section 4.7 the notion of the dual representation $\pi^*$ associated to
a finite-dimensional representation $\pi$ of a group or Lie algebra. We know in
general that $\pi^*$ is irreducible if and only if $\pi$ is irreducible and that $(\pi^*)^*$ is
equivalent (as a representation) to $\pi$. In the case of representations of
semisimple Lie algebras, we have the following result. (Compare Exercise 2 in Chapter
5.)


Proposition 7.39. If $\pi$ is an irreducible finite-dimensional representation of
$\mathfrak{g}$, then the weights of $\pi^*$ are the negatives of the weights of $\pi$. Specifically, if
$\mu$ is a weight of $\pi$, then $-\mu$ is a weight of $\pi^*$ with the same multiplicity as $\mu$.


Proof. Let $V$ be the space on which $\pi$ acts, let $\mu$ be a weight of $\pi$, and let
$V_\mu$ be the corresponding weight space. We know that $V$ is the direct sum of
$V_\mu$ and the weight spaces for the other weights $\sigma$ of $\pi$. Now let $\phi$ be a linear
functional on $V_\mu$, and extend $\phi$ to a linear functional on all of $V$ by setting
$\phi$ to zero on all weight spaces $V_\sigma$, $\sigma \neq \mu$. Let us compute how the operators
$\pi^*(H)$, $H \in \mathfrak{h}$, act on $\phi$. If $v \in V_\mu$, then we have, by the definition of the dual
representation,

\[ [\pi^*(H)\phi](v) = [-\pi(H)^{tr}\phi](v) = -\phi(\pi(H)v) = -\langle \mu, H \rangle \phi(v). \]

If $v \in V_\sigma$, with $\sigma \neq \mu$, then both $\phi(v)$ and $\phi(\pi(H)v)$ are equal to zero, and so
we still have

\[ [\pi^*(H)\phi](v) = -\langle \mu, H \rangle \phi(v) \tag{7.47} \]

(both sides equal to zero). Thus, actually, (7.47) holds for all $v \in V$ and,
therefore, $\pi^*(H)\phi = -\langle \mu, H \rangle \phi$. This shows that $-\mu$ is a weight of $\pi^*$, and it
is easily seen that the multiplicity of $-\mu$ in $\pi^*$ is the same as the multiplicity
of $\mu$ in $\pi$. □


Now, we have classified representations of complex semisimple Lie algebras
by their highest weights. So it is reasonable to ask how the highest weight of
$\pi^*$ is related to the highest weight of $\pi$. The answer is provided in the following
result.


Theorem 7.40. There exists a unique element $w_0$ of $W$ such that for each
dominant integral weight $\mu$, $w_0 \cdot (-\mu)$ is, again, dominant integral. If $\pi$ is
an irreducible representation with highest weight $\mu$, then $\pi^*$ is an irreducible
representation with highest weight $w_0 \cdot (-\mu)$.


The proof of this result uses standard properties of the Weyl group (Section 
8.7) and is omitted. 

If it happens that $-I$ is an element of the Weyl group (which is the case
for some semisimple Lie algebras but not for others), then we have $w_0 = -I$.
This holds, for example, for $\mathfrak{g} = so(5;\mathbb{C})$, whose root system is "$B_2$." See
Section 8.5. In such cases, $w_0 \cdot (-\mu) = \mu$ and, so, the highest weight of $\pi^*$ is,
again, $\mu$. Thus, in Lie algebras where $-I$ is an element of the Weyl group,
every irreducible representation is equivalent to its dual.

For $sl(3;\mathbb{C})$, $-I$ is not an element of the Weyl group. If we take the usual
base $\Delta = \{\alpha_1, \alpha_2\}$ for the root system of $sl(3;\mathbb{C})$, then $w_0$ will be the reflection
about the line perpendicular to the root $\alpha_3 = \alpha_1 + \alpha_2$. (Compare Figures 5.2
and 5.3.) If $\mu_1$ and $\mu_2$ are the fundamental weights for $sl(3;\mathbb{C})$ (circled in Figure
5.2), then every dominant integral element is of the form $\mu = m_1\mu_1 + m_2\mu_2$.
Then, $w_0 \cdot (-\mu) = m_2\mu_1 + m_1\mu_2$. (Compare Exercise 3 in Chapter 5.) In the
case of $sl(3;\mathbb{C})$, a representation with highest weight $\mu = m_1\mu_1 + m_2\mu_2$ is
equivalent to its dual if and only if $m_1 = m_2$.
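The formula for the highest weight of the dual in the sl(3;C) case can be verified by a direct computation. The sketch below records a weight by its coordinates (m1, m2) with respect to the fundamental weights and assumes the standard formulas for the two simple reflections in these coordinates (the simple roots being (2, -1) and (-1, 2)). The longest element is taken to be the product of three simple reflections; all function names are illustrative.

```python
def s1(w):
    """Reflection in the first simple root, in fundamental-weight coordinates."""
    m1, m2 = w
    return (-m1, m1 + m2)

def s2(w):
    """Reflection in the second simple root, in fundamental-weight coordinates."""
    m1, m2 = w
    return (m1 + m2, -m2)

def w0(w):
    """Longest Weyl group element of sl(3; C), written as s1 s2 s1."""
    return s1(s2(s1(w)))

def dual_highest_weight(mu):
    """Highest weight of the dual representation: w0 applied to -mu."""
    m1, m2 = mu
    return w0((-m1, -m2))

# The dual of (m1, m2) is (m2, m1) for a range of dominant weights.
swap_ok = all(dual_highest_weight((m1, m2)) == (m2, m1)
              for m1 in range(6) for m2 in range(6))
# Self-duality holds exactly when m1 == m2.
self_dual_ok = all((dual_highest_weight((m1, m2)) == (m1, m2)) == (m1 == m2)
                   for m1 in range(6) for m2 in range(6))
```

For instance, `dual_highest_weight((1, 0))` evaluates to `(0, 1)`, matching the statement that the standard representation and its dual have highest weights given by the two fundamental weights.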


7.6.2 The weights and their multiplicities


It is important to know not only the highest weight of a representation, but
all of the weights, along with the multiplicities of the weights. We begin by
looking at which weights occur and then turn to the multiplicities. For pictures
of the weights of various representations, see Sections 5.7, 8.5, and 8.6. Recall
from Section 5.7 that the convex hull of a finite collection of points in a vector
space is the smallest convex set containing those points.


Theorem 7.41. Suppose that $V$ is an irreducible representation with highest weight $\mu_0$.
Then, an element $\mu \in \mathfrak{h}$ is a weight of $V$ if and only if the following two
conditions are satisfied:


1. $\mu$ is contained in the convex hull of the orbit of $\mu_0$ under the Weyl group.
2. $\mu_0 - \mu$ can be expressed as a linear combination of the positive simple roots
with integer coefficients.


Note that Condition 2 implies that $\mu$ is an integral element, since $\mu_0$ and
all of the roots are integral. On the other hand, there will typically be integral
elements $\mu$ contained in the convex hull of the $W$-orbit of $\mu_0$ that are not
weights of $V$. After all, if $\mu$ is an integral element, then $\mu_0 - \mu$ is also an integral
element, but this does not necessarily mean that $\mu_0 - \mu$ can be expressed as
an integer linear combination of roots. See Section 8.10 for more on this issue.

As discussed after the statement of Theorem 7.15, every integral element
occurs as a weight of some irreducible representation of $\mathfrak{g}$. In fact, given an
integral element $\mu$, it follows from Theorem 7.41 and standard results about
the Weyl group (Section 8.7) that there exist infinitely many inequivalent
irreducible representations of $\mathfrak{g}$ for which $\mu$ is a weight. See Exercises 8 and 9
in Chapter 8.

The proof of Theorem 7.41 is the same as the proof in the sl(3;C) case, 
which is sketched in Section 5.7. 

We now turn to the matter of the multiplicities. In the sl(3;C) case, there 
is a simple pattern to the multiplicities, described in Section 5.7. In the gen- 
eral case, things are more complicated and there are two standard results 
about the multiplicities: Freudenthal’s formula and Kostant’s formula, both 
of which allow one, in principle, to compute all of the multiplicities in any 
representation. Although carrying out these computations in a particular
case can be arduous, there exist computer programs that can do them. I 
present here (without proof) Kostant’s formula, which gives the multiplicity 
of each weight directly. Freudenthal’s formula gives a recursive algorithm for 
computing the multiplicities that may be more computationally efficient than 
Kostant’s formula in high-rank examples. Freudenthal’s formula is in Section 
22 of Humphreys (1972) and Kostant’s formula is in Section 24. 

Suppose now that $\lambda$ is an element of the root lattice (i.e., that $\lambda$ can be
expressed as a linear combination of roots with integer coefficients). Now, let
$p(\lambda)$ denote the number of ways that $\lambda$ can be expressed as a linear
combination of positive roots with non-negative integer coefficients. If $\lambda$ is higher than
(or equal to) zero, then there will be at least one such way; if $\lambda$ is not higher than zero,
then there will be no such way and $p(\lambda) = 0$. Consider, for example, the case
of $sl(3;\mathbb{C})$, in which we have three positive roots: $\alpha_1$, $\alpha_2$, and $\alpha_3 = \alpha_1 + \alpha_2$.
If $\lambda = 2\alpha_1 + 3\alpha_2$, then we have

\[ \lambda = 2\alpha_1 + 3\alpha_2 = \alpha_1 + 2\alpha_2 + \alpha_3 = \alpha_2 + 2\alpha_3 \]

and, so, $p(\lambda) = 3$. More generally, every element of the root lattice for $sl(3;\mathbb{C})$
can be expressed as $\lambda = k\alpha_1 + l\alpha_2$ for a unique pair of integers $k$ and $l$. If
either $k$ or $l$ is negative, then $p(\lambda) = 0$. If both $k$ and $l$ are non-negative, then
$p(\lambda) = 1 + \min(k, l)$.
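The closed formula for the partition function of sl(3;C) is easy to confirm by brute force. In the sketch below (the function name is my own), an element of the root lattice is recorded by its integer coordinates (k, l) with respect to the two simple roots, and we count the non-negative triples (c1, c2, c3) of coefficients on the three positive roots. Since the third positive root is the sum of the first two, such a triple is determined by c3.

```python
def partition_count(k, l):
    """Count (c1, c2, c3) >= 0 with c1*a1 + c2*a2 + c3*(a1 + a2) = k*a1 + l*a2."""
    count = 0
    for c3 in range(0, max(k, l, 0) + 1):
        c1, c2 = k - c3, l - c3   # forced by comparing coefficients of a1, a2
        if c1 >= 0 and c2 >= 0:
            count += 1
    return count

def closed_form(k, l):
    """The formula from the text: 1 + min(k, l) if k, l >= 0, and 0 otherwise."""
    return 1 + min(k, l) if k >= 0 and l >= 0 else 0

# The worked example: 2*a1 + 3*a2 has exactly three expressions.
example = partition_count(2, 3)
# The brute-force count agrees with the closed form on a grid of lattice points.
agrees = all(partition_count(k, l) == closed_form(k, l)
             for k in range(-2, 8) for l in range(-2, 8))
```

For Lie algebras of higher rank no such simple closed form is available in general, and one would enumerate coefficient vectors over all positive roots instead.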


Theorem 7.42 (Kostant). Suppose that $V$ is a finite-dimensional
irreducible representation of a complex semisimple Lie algebra $\mathfrak{g}$ with highest
weight $\mu_0$. If $\mu$ is a weight of $V$, then the multiplicity $m_{\mu_0}(\mu)$ is given by

\[ m_{\mu_0}(\mu) = \sum_{w \in W} \det(w)\, p\bigl(w \cdot (\mu_0 + \delta) - (\mu + \delta)\bigr), \]

where $\delta$ is half the sum of the positive roots.


Let us consider, first, the $w = I$ term in the sum, namely

\[ p((\mu_0 + \delta) - (\mu + \delta)) = p(\mu_0 - \mu). \]

Since $\mu_0$ is higher than (or equal to) $\mu$, $\mu_0 - \mu$ is higher than (or equal to) zero,
and so this term in the sum is nonzero. If the highest weight $\mu_0$ is sufficiently
far from the walls of the fundamental Weyl chamber and $\mu$ is sufficiently close
to $\mu_0$, then it will happen that for all $w \neq I$, $w \cdot (\mu_0 + \delta)$ is not higher than
$\mu + \delta$. So, $p(w \cdot (\mu_0 + \delta) - (\mu + \delta))$ will be zero for all $w \neq I$, and in such cases,
$m_{\mu_0}(\mu) = p(\mu_0 - \mu)$. (In the Verma module $V_{\mu_0}$, described in Section 7.3, all
weights $\mu$ have multiplicity $p(\mu_0 - \mu)$.)

As an example, consider the representation of sl(3;C) with highest weight 
(0,3) and consider the weight $\mu = 0$ occurring in this representation. Figure 
7.1 illustrates the computations needed to apply Kostant’s formula in this 
case. In the figure, black dots indicate the weights of the representation and 
the vertices of the outer hexagon indicate points of the form $w \cdot (\mu_0 + \delta)$, where 
$\mu_0$ is the highest weight, (0,3). Of the six elements of the form $w \cdot (\mu_0 + \delta)$, 
only two are higher than $\mu + \delta$, namely $\mu_0 + \delta$ and $w_{\alpha_1} \cdot (\mu_0 + \delta)$. Since 
$\det(w_{\alpha_1}) = -1$, we obtain 

\[ m_{\mu_0}(\mu) = p(\alpha_1 + 2\alpha_2) - p(2\alpha_2) = 2 - 1 = 1. \]

In this representation, all weights have multiplicity one. 
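
The computation above can be reproduced mechanically. The sketch below is an illustration only, under the following assumptions that are not in the original text: weights of sl(3;C) are written in the coordinates $(m_1, m_2)$ used for highest weights, so that $\alpha_1 = (2,-1)$, $\alpha_2 = (-1,2)$ and $\delta = (1,1)$; the two simple reflections act by $(m_1, m_2) \mapsto (-m_1, m_1 + m_2)$ and $(m_1, m_2) \mapsto (m_1 + m_2, -m_2)$; and the highest weight is dominant. All function names are hypothetical.

```python
def reflect1(v):
    """Reflection w_{alpha_1} acting on a weight (m1, m2)."""
    m1, m2 = v
    return (-m1, m1 + m2)


def reflect2(v):
    """Reflection w_{alpha_2} acting on a weight (m1, m2)."""
    m1, m2 = v
    return (m1 + m2, -m2)


def partition(a, b):
    """Kostant partition function p for the element with coordinates (a, b)."""
    if (2 * a + b) % 3 != 0:
        return 0  # not in the root lattice
    # Solve (a, b) = k*alpha_1 + l*alpha_2 for the integers k and l.
    k, l = (2 * a + b) // 3, (a + 2 * b) // 3
    return 0 if k < 0 or l < 0 else 1 + min(k, l)


def kostant_multiplicity(highest, weight):
    """Multiplicity of `weight` in the irreducible representation with
    dominant highest weight `highest`, via Kostant's formula."""
    start = (highest[0] + 1, highest[1] + 1)  # mu_0 + delta
    signed_orbit = {start: 1}                 # w.(mu_0 + delta) -> det(w)
    stack = [start]
    while stack:
        v = stack.pop()
        for reflect in (reflect1, reflect2):
            w = reflect(v)
            if w not in signed_orbit:
                signed_orbit[w] = -signed_orbit[v]  # each reflection has det -1
                stack.append(w)
    shifted = (weight[0] + 1, weight[1] + 1)  # mu + delta
    return sum(
        sign * partition(v[0] - shifted[0], v[1] - shifted[1])
        for v, sign in signed_orbit.items()
    )


print(kostant_multiplicity((0, 3), (0, 0)))  # prints 1
print(kostant_multiplicity((1, 1), (0, 0)))  # prints 2
```

The second call concerns the representation with highest weight (1,1), in which the weight zero has multiplicity two, so the negative terms in the sum do not always reduce the answer to one.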

In the case of sl(3;C), it is possible to use Kostant’s or Freudenthal’s 
formula to show that the multiplicities follow the simple pattern described in 
Section 5.7. For other Lie algebras, the pattern of multiplicities can be more 
intricate. 

7.6.3 The Weyl character formula and the Weyl dimension formula 


Let $K$ be a simply-connected compact Lie group, let $\mathfrak{k}$ be its Lie algebra, 
and let $\mathfrak{g} = \mathfrak{k}_{\mathbb{C}}$, so that $\mathfrak{g}$ is a complex semisimple Lie algebra. Then, there 
is a one-to-one correspondence between the continuous representations of $K$ 
and the complex-linear representations of $\mathfrak{g}$. Let $\mathfrak{t}$ be a maximal commutative 
subalgebra of $\mathfrak{k}$ and let $T$ be the connected subgroup of $K$ whose Lie algebra 
is $\mathfrak{t}$. Recall from Section 7.4 that the character of a representation $\Pi$ of $K$ is 
the function $\chi_\Pi$ on $K$ defined by 

\[ \chi_\Pi(x) = \operatorname{trace}(\Pi(x)). \]

Recall also the Weyl character formula (Theorem 7.28), which gives an expression 
for the restriction to $T$ of the character of an irreducible representation 
with highest weight $\mu$. (The statement of that theorem is in terms of the real 
weights, which are $1/i$ times the ordinary weights.) 

Note that the value of the character at the identity is equal to the dimension 
of $\Pi$: 

\[ \chi_\Pi(I) = \operatorname{trace}(I) = \dim \Pi. \]


This means that the dimension of the representation can be obtained by eval- 
uating the character at the identity. Unfortunately, this procedure is not as 
simple as it sounds, since the Weyl character formula can be taken literally 
only at points where the denominator is nonzero. At the identity, the de- 
nominator of the Weyl character formula is zero and the whole formula is 
of the “zero over zero” form. Nevertheless, it is possible to use a version of 
L’Hôpital’s rule to evaluate the character at the origin so as to obtain the 
following result. 


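Theorem 7.43. Suppose that $\pi$ is an irreducible representation of $\mathfrak{g}$ with highest 
weight $\mu$. Then, the dimension of $\pi$ is given by 

\[ \dim \pi = \frac{\prod_{\alpha \in R^+} \langle \alpha, \mu + \delta \rangle}{\prod_{\alpha \in R^+} \langle \alpha, \delta \rangle}, \]

where $R^+$ denotes the set of positive roots and $\delta$ is half the sum of the positive 
roots. 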


Note that the denominator of this formula is a constant, independent of 
u. So, the dimension is a polynomial function of the highest weight, and the 
degree of this polynomial is equal to the number of positive roots. The dimension 
formulas for the cases of sl(2;C) and sl(3;C), which we have discussed 
previously, are consistent with this. For sl(2;C), the dimension is $m + 1$, and 
for sl(3;C), it is $\tfrac{1}{2}(m_1 + 1)(m_2 + 1)(m_1 + m_2 + 2)$, reflecting that for sl(2;C) 
there is one positive root and for sl(3;C) there are three. 

Let us now verify that Theorem 7.43 agrees with the results we have stated 
earlier for the sl(2;C) and sl(3;C) cases. For sl(2;C), we think of the roots and 
weights as being simply numbers (the eigenvalue of $H$), and the inner product 
as simply the product. The one positive root $\alpha$ is then the number 2 (reflecting 
that $\mathrm{ad}_H(X) = 2X$), and half the sum of the positive roots is the number 1. So, 
if m is the highest eigenvalue of H occurring in an irreducible representation, 
the dimension predicted by Theorem 7.43 is (m + 1)/1, in agreement with 
Chapter 4. 

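For the case of sl(3;C), we note that 

\[ m_1 = \mu(H_1) = 2\frac{\langle \alpha_1, \mu \rangle}{\langle \alpha_1, \alpha_1 \rangle}, \qquad m_2 = \mu(H_2) = 2\frac{\langle \alpha_2, \mu \rangle}{\langle \alpha_2, \alpha_2 \rangle}. \]

Let us normalize all the roots so that $\langle \alpha, \alpha \rangle = 2$ (as in Section 5.6). Using that 
normalization of the inner product, we have $m_1 = \langle \alpha_1, \mu \rangle$ and $m_2 = \langle \alpha_2, \mu \rangle$. 
Letting $\alpha_3 = \alpha_1 + \alpha_2$, we have 

\[ \delta = \tfrac{1}{2}(\alpha_1 + \alpha_2 + \alpha_3) = \alpha_3. \]

We then note that $\langle \alpha_1, \delta \rangle = 1$, $\langle \alpha_2, \delta \rangle = 1$, and $\langle \alpha_3, \delta \rangle = 2$. So, the numerator 
in the dimension formula is 

\[ \bigl(\langle \alpha_1, \mu \rangle + \langle \alpha_1, \delta \rangle\bigr)\bigl(\langle \alpha_2, \mu \rangle + \langle \alpha_2, \delta \rangle\bigr)\bigl(\langle \alpha_3, \mu \rangle + \langle \alpha_3, \delta \rangle\bigr) = (m_1 + 1)(m_2 + 1)(m_1 + m_2 + 2) \]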



and the denominator is (1)(1)(2). Thus, the dimension formula in this case 
becomes 


\[ \dim \pi = \tfrac{1}{2}(m_1 + 1)(m_2 + 1)(m_1 + m_2 + 2), \]


as stated in Chapter 5. 
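
As a numerical cross-check of this computation, the sketch below evaluates the quotient of products in Theorem 7.43 directly and compares it with the closed form. It is an illustration under stated assumptions rather than anything taken from the text: the roots of sl(3;C) are realized as the vectors $e_i - e_j$ in $\mathbb{R}^3$ (which gives $\langle \alpha, \alpha \rangle = 2$), the fundamental weights are written out by hand in that realization, and the name `weyl_dimension_sl3` is hypothetical.

```python
from fractions import Fraction


def inner(u, v):
    """Standard inner product on R^3."""
    return sum(a * b for a, b in zip(u, v))


# Positive roots alpha_1, alpha_2, alpha_3 = alpha_1 + alpha_2.
POSITIVE_ROOTS = [(1, -1, 0), (0, 1, -1), (1, 0, -1)]

# Fundamental weights: <alpha_i, FUNDAMENTAL[j]> equals 1 if i == j, else 0.
FUNDAMENTAL = [
    (Fraction(2, 3), Fraction(-1, 3), Fraction(-1, 3)),
    (Fraction(1, 3), Fraction(1, 3), Fraction(-2, 3)),
]


def weyl_dimension_sl3(m1, m2):
    """Evaluate the Weyl dimension formula for highest weight (m1, m2)."""
    mu = tuple(m1 * a + m2 * b for a, b in zip(*FUNDAMENTAL))
    delta = tuple(Fraction(sum(c), 2) for c in zip(*POSITIVE_ROOTS))
    mu_plus_delta = tuple(x + y for x, y in zip(mu, delta))
    numerator = Fraction(1)
    denominator = Fraction(1)
    for alpha in POSITIVE_ROOTS:
        numerator *= inner(alpha, mu_plus_delta)
        denominator *= inner(alpha, delta)
    return numerator / denominator


for m1 in range(5):
    for m2 in range(5):
        closed_form = Fraction((m1 + 1) * (m2 + 1) * (m1 + m2 + 2), 2)
        assert weyl_dimension_sl3(m1, m2) == closed_form

print(weyl_dimension_sl3(1, 1))  # prints 8
```

Exact rational arithmetic is used so that the comparison with the closed form is an equality of numbers rather than a floating-point approximation.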


7.6.4 The analytical proof of the Weyl character formula 


Suppose that $K$ is a simply-connected compact Lie group with Lie algebra 
$\mathfrak{k}$, so that $\mathfrak{g} := \mathfrak{k}_{\mathbb{C}}$ is a complex semisimple Lie algebra. We fix a maximal 
commutative subalgebra $\mathfrak{t}$ of $\mathfrak{k}$ and we let $T$ be the connected Lie subgroup 
of $K$ whose Lie algebra is $\mathfrak{t}$. It can be shown that $T$ is a torus; that is, $T$ is 
isomorphic to $S^1 \times S^1 \times \cdots \times S^1$. Furthermore, it can be shown that every 
element of $K$ is conjugate to an element of $T$; that is, given $A \in K$, there 
exists $B \in K$ such that $BAB^{-1} \in T$. (See Chapter IV of Bröcker and tom 
Dieck (1985).) 

As in Section 7.4, we work with the real roots, which are the elements $\alpha$ 
of $\mathfrak{t}$ such that there exists a nonzero element $X$ of $\mathfrak{g}$ with 

\[ [H, X] = i\langle \alpha, H \rangle X \]

for all $H$ in $\mathfrak{t}$ (and, therefore, for all $H$ in $\mathfrak{h}$). We consider also the integral 
real elements, which are those elements $\mu$ of $\mathfrak{t}$ such that $2\langle \alpha, \mu \rangle / \langle \alpha, \alpha \rangle$ is an 
integer for each real root $\alpha$. As discussed in Section 7.4, for each integral real 
element $\mu$, there is a function $f_\mu$ on $T$ such that 

\[ f_\mu(e^H) = e^{i\langle \mu, H \rangle} \tag{7.48} \]

for all $H$ in $\mathfrak{t}$. Functions of this form are called torus characters; they are in 
fact the characters of the irreducible representations of $T$, which are necessarily 
one-dimensional since $T$ is commutative. 


Proposition 7.44. The torus characters form an orthonormal set in 
$L^2(T, dt)$, where $dt$ is the normalized Haar measure on $T$. 


Actually, the torus characters form an orthonormal basis, but the com- 
pleteness is not relevant here. This result can be thought of as the Peter-Weyl 
theorem for T (since on the commutative group T every function is a class 
function), but it can also be proved directly. See Exercise 2. 

We now let $\delta$ denote half the sum of the positive real roots: 

\[ \delta = \frac{1}{2} \sum_{\alpha \in R^+} \alpha. \]

Note that the sum is over all positive roots, not just the positive simple roots. 
It can be shown that $\delta$ is an integral element (Section 8.7). 

Definition 7.45. The Weyl denominator is the function $\sigma : T \to \mathbb{C}$ given 
by 

\[ \sigma(e^H) = \sum_{w \in W} \det(w)\, e^{i\langle w \cdot \delta, H \rangle}. \]

This is a well-defined function on $T$ because $\delta$, and therefore also $w \cdot \delta$ for 
all $w \in W$, is an integral element. This is (as the name suggests) the function 
that occurs in the denominator of the Weyl character formula. 

The next key ingredient is the Weyl integral formula. I state here just 
the special case of the formula that applies to class functions; there is a more 
general version of the formula that applies to arbitrary functions on K. 


Theorem 7.46. Let $f$ be a continuous class function on $K$, let $dA$ denote 
the normalized Haar measure on $K$, and let $dt$ denote the normalized Haar 
measure on $T$. Then, 

\[ \int_K f(A)\, dA = \frac{1}{|W|} \int_T f(t)\, |\sigma(t)|^2\, dt. \]

Here, $|W|$ denotes the order of the Weyl group (i.e., the number of 
elements in $W$). 

We are not going to prove this formula here but will only sketch the approach 
one uses. We consider the map $\Phi : T \times K/T \to K$ given by 

\[ \Phi(t, A) = AtA^{-1}, \qquad t \in T,\ A \in K. \]

We have written $\Phi$ as a map from $T \times K$ into $K$, but if we replace $A$ by $At'$ for 
some $t'$ in $T$, then (since $T$ is commutative) the value of $\Phi$ does not change. 
Thus, we may think of $\Phi$ as mapping $T \times K/T$ into $K$; in fact, it maps onto 
$K$, because every point of $K$ is conjugate to a point in $T$. We want to use 
the change-of-variables formula to change the integral over $K$ into an integral 
over $T \times K/T$. 

In the case of a class function, $f(AtA^{-1})$ is independent of $A$, so the 
integration over $K/T$ drops out and we are left with just an integral over $T$. 
The factor of $|\sigma(t)|^2$ is essentially the Jacobian of the map $\Phi$, which enters 
as a consequence of the change-of-variables theorem. The factor of $|W|$ arises 
because an element $A$ of $K$ can be conjugate to several different elements of 
$T$. In fact, if $A$ is conjugate to a point $t$ in $T$, then $A$ is also conjugate to any 
point in $T$ of the form $w \cdot t$, for $w$ in the Weyl group. (This is true because 
if $B$ is in $N(\mathfrak{t})$, then $\mathrm{Ad}_B$ preserves $\mathfrak{t}$ and thus also $T$.) “Generic” elements of 
$K$ are conjugate to exactly $|W|$ elements of $T$ and the map $\Phi$ is “generically” 
$|W|$-to-one, and this is why we need to divide by $|W|$ in the Weyl integral 
formula. To say the same thing another way, each “generic” conjugacy class 
intersects $T$ in exactly $|W|$ points and, so, when integrating over $T$, we are 
“overcounting” the conjugacy classes by a factor of $|W|$. We then need to 
divide by $|W|$ to compensate for this. 

There are two important features of the Weyl integral formula that we 
will use in the proof of the Weyl character formula. The first is the factor of 
$|W|$, which we discussed in the previous paragraph. This factor is extremely 
important. It is instructive to check the normalization of the Weyl integral 
formula by checking the formula in the case $f = 1$. Then, the left-hand side 
of the integral formula is equal to 1, and the right-hand side is equal to 
$|W| / |W|$ since the terms in the Weyl denominator are orthonormal elements 
of $L^2(T, dt)$. (The points $w \cdot \delta$, $w \in W$, are distinct integral elements.) The 
second important feature is that the Jacobian factor $|\sigma(t)|^2$ is the absolute 
value squared of a very nice function $\sigma$. Of course, a Jacobian is always 
non-negative and, thus, has a non-negative square root. This non-negative square 
root, however, may not be a nice function; for example, it may not be smooth. 
(For example, the non-negative square root of the function $x^2$ is $|x|$, which 
is not differentiable at the origin.) Here, the function $\sigma$ (which is not 
non-negative) is a very nice function, not only smooth but also having a very 
simple expansion in terms of torus characters. 

We now turn to the proof of the Weyl character formula. The main ingredients 
are (1) the factor of $|W|$ in the Weyl integral formula, (2) the precise 
form of the function $\sigma$, and (3) the Peter-Weyl theorem. We do not need 
the full power of the Peter-Weyl theorem here. We need only that the characters 
of the irreducible representations of $K$ have $L^2$ norm 1 with respect 
to the normalized Haar measure on $K$. This follows from fairly elementary 
“orthogonality relations.” (See Section II.4 of Bröcker and tom Dieck (1985).) 

We now start thinking about characters of representations of K. 


Proposition 7.47. If $\Pi$ is a representation of $K$, then the restriction of the 
character $\chi_\Pi$ to $T$ satisfies 

\[ \chi_\Pi(e^H) = \sum_{\mu} m_\mu\, e^{i\langle \mu, H \rangle}, \]

where the sum is over all the real weights of $\Pi$ and where $m_\mu$ is the multiplicity 
of the weight $\mu$. 


Proof. The space $V$ on which $\Pi$ acts is the direct sum of the weight spaces 
$V_\mu$ associated to the real weights $\mu$. On each $V_\mu$, we have (by definition) that 
$\pi(H) = i\langle \mu, H \rangle I$. Thus, $\Pi(\exp H) = \exp(i\langle \mu, H \rangle)\, I$ on $V_\mu$. Taking the trace 
of this equality gives the proposition. □ 


This is a “formula” of sorts for the character, but rather cumbersome 
because the number of weights involved gets larger and larger as the high- 
est weight gets larger, and also because computing the multiplicities can be 
complicated. For example, trying to compute the dimension of $\Pi$ from this 
formula is not so easy—one would have to sum up all the multiplicities of all 
the weights involved. By contrast, the Weyl character formula yields (with a 
bit of effort) the simple dimension formula given in Theorem 7.43. 

We now turn to the proof of the Weyl character formula, as stated in 
Section 7.4. 


Proof. If the representation $\Pi$ has highest weight $\mu$, then the character formula 
is equivalent to saying that 

\[ \Bigl( \sum_{w \in W} \det(w)\, e^{i\langle w \cdot \delta, H \rangle} \Bigr) \chi_\Pi(e^H) = \sum_{w \in W} \det(w)\, e^{i\langle w \cdot (\mu + \delta), H \rangle}. \tag{7.49} \]

Now, the Weyl denominator (the sum on the left in (7.49)) is a sum of torus 
characters with coefficients equal to $\pm 1$ and with $|W|$ terms. The character 
$\chi_\Pi$ itself (restricted to $T$) is a sum of torus characters with non-negative integer 
coefficients and with the number of terms getting larger and larger as the 
highest weight gets larger and larger. When we multiply the Weyl denominator 
and the character, we will get an apparently huge sum of torus characters with 
integer coefficients. Note that the product of two torus characters (as in (7.48)) 
with weights $\mu_1$ and $\mu_2$ is another torus character, with weight $\mu_1 + \mu_2$. 

The Weyl character formula says that when we multiply the character and the 
Weyl denominator, all but $|W|$ terms in this huge sum must cancel out. (This 
can be seen very explicitly in the SU(2) case. See (7.25) in Section 7.4.) Note 
that the possibility of cancellation arises because of the minus signs in the 
formula for the Weyl denominator and because the same torus character can 
arise in several different ways in the sum. 
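
The cancellation is easy to watch in the SU(2) case. The sketch below is an illustration under assumptions not fixed by the text: a torus character $e^{ik\theta}$ is recorded by its integer exponent $k$, the representation with highest weight $m$ has the exponents $m, m-2, \ldots, -m$ each with multiplicity one, and the Weyl denominator is $e^{i\theta} - e^{-i\theta}$. The helper names are hypothetical.

```python
from collections import defaultdict


def multiply(p, q):
    """Multiply two sums of torus characters, each stored as
    {exponent: coefficient}; exponents add when characters multiply."""
    out = defaultdict(int)
    for e1, c1 in p.items():
        for e2, c2 in q.items():
            out[e1 + e2] += c1 * c2
    # Drop the terms whose coefficients have cancelled.
    return {e: c for e, c in out.items() if c != 0}


def su2_character(m):
    """Restriction to the torus of the character with highest weight m."""
    return {k: 1 for k in range(-m, m + 1, 2)}


WEYL_DENOMINATOR = {1: 1, -1: -1}  # e^{i theta} - e^{-i theta}

surviving = multiply(WEYL_DENOMINATOR, su2_character(4))
print(surviving)

# The sum of the squares of the surviving coefficients equals |W| = 2.
print(sum(c * c for c in surviving.values()))  # prints 2
```

For $m = 4$, the ten products collapse to the two terms with exponents $5$ and $-5$, with coefficients $+1$ and $-1$, in line with the right-hand side of the character formula.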

How do we know this cancellation actually occurs? Well, the Peter-Weyl 
theorem tells us that the character has $L^2$ norm 1 over $K$ with respect to 
normalized Haar measure on $K$. Then, the Weyl integral formula (Theorem 
7.46) implies that the product of the Weyl denominator $\sigma$ and $\chi_\Pi$ is a function 
on $T$ whose $L^2$ norm squared is equal to $|W|$. However, as we have just 
discussed, $\sigma\chi_\Pi$ is a large sum of torus characters with integer coefficients, 
and the torus characters are orthonormal (Proposition 7.44). So, the $L^2$ norm 
squared of $\sigma\chi_\Pi$ is the sum of the squares of the coefficients, and this must 
equal $|W|$. This means that there can be at most $|W|$ terms in the expression 
for $\sigma\chi_\Pi$ in terms of torus characters. 

On the other hand, there is at least one torus character in the product 
$\sigma\chi_\Pi$ that cannot cancel out, namely $e^{i\langle \mu + \delta, H \rangle}$. Since this term comes from 
the highest weight in $\sigma$ and the highest weight in $\chi_\Pi$, it occurs only once 
and does not cancel out. Meanwhile, the sets of weights in $\sigma$ and in $\chi_\Pi$ are 
invariant under the Weyl group, and so the set of weights occurring in $\sigma\chi_\Pi$ is 
also invariant under the Weyl group. So, if $e^{i\langle \mu + \delta, H \rangle}$ does not cancel out, then 
neither does $e^{i\langle w \cdot (\mu + \delta), H \rangle}$, $w \in W$. This means that we have $|W|$ terms that 
cannot cancel out in $\sigma\chi_\Pi$; all other terms must cancel out or the $L^2$ norm of 
$\chi_\Pi$ would be larger than 1. 

It remains only to check the coefficients of the terms of the form $e^{i\langle w \cdot (\mu + \delta), H \rangle}$. 
These terms will arise as the product of $\det(w)\, e^{i\langle w \cdot \delta, H \rangle}$ in $\sigma$ and $m(w \cdot \mu)\, e^{i\langle w \cdot \mu, H \rangle}$ 
in $\chi_\Pi$. However, we know that $\mu$ has multiplicity one and that the multiplicities 
are invariant under the action of the Weyl group. So, $m(w \cdot \mu) = 1$ and 
the coefficient of $e^{i\langle w \cdot (\mu + \delta), H \rangle}$ will be $\det(w)$. □ 


7.7 Exercises 


1. For any complex number $\mu$, consider the Verma module $V_\mu$ for sl(2;C) 
described in Section 7.3.4. Show that if $\mu$ is not a non-negative integer, 
then $V_\mu$ is irreducible. 

2. Show that every continuous homomorphism of $S^1$ into $\mathbb{C}^*$ is of the form 
$e^{i\theta} \mapsto e^{in\theta}$ for some $n \in \mathbb{Z}$. Show that the functions $\{e^{in\theta}\}_{n \in \mathbb{Z}}$ form an 
orthonormal set inside $L^2(S^1, d\theta/2\pi)$. 

Note: The functions $e^{in\theta}$ are the “torus characters” for the one-dimensional 
torus $S^1$. Compare Proposition 7.44. 

3. Let T be the subgroup of SU(n) consisting of diagonal elements. (Then, 
T is a “maximal torus” in SU(n).) Show that every element A of SU(n) 
is conjugate to an element of T. Let W denote the Weyl group for SU(n), 
namely the permutation group acting on T by permuting the diagonal 
elements. Under what conditions is A conjugate to exactly |W| elements 
of T? 

4. If V is an irreducible representation of g and u is a nonzero element of V, 
show that the map u > F,» in (7.34) is injective. 

5. Verify directly the Weyl character formula for the representation of sl(3;C) 
with highest weight (1,2), using the weights and multiplicities of this representation 
given in Section 5.7. To do this, consider the Weyl denominator 
$\sigma$ given in Definition 7.45 and the formula for $\chi_\Pi$ given in Proposition 
7.47. Now, compute $\sigma\chi_\Pi$ using that $e^{i\langle \mu, H \rangle} e^{i\langle \lambda, H \rangle} = e^{i\langle \mu + \lambda, H \rangle}$. 

6. This exercise concerns the Borel-Weil construction in the case G = 
SL(n;C). Let $B^+$ denote the subgroup of SL(n;C) consisting of matrices 
of the form 

\[ a = \begin{pmatrix} a_1 & & * \\ & \ddots & \\ 0 & & a_n \end{pmatrix} \tag{7.50} \]

and let $N^-$ denote the subgroup of SL(n;C) consisting of lower triangular 
matrices with ones on the diagonal. Let $B^-$ denote the matrices of 
the same form as in (7.50) except with the nonzero elements below the 
diagonal. Consider the homomorphism $X : B^+ \to \mathbb{C}^*$ given by 

\[ X(a) = a_1^{m_1} (a_1 a_2)^{m_2} \cdots (a_1 a_2 \cdots a_{n-1})^{m_{n-1}}, \tag{7.51} \]

where $m_1, \ldots, m_{n-1}$ are non-negative integers. For any $k = 1, 2, \ldots, n-1$ 
and any $g \in$ SL(n;C), let $\det_k(g)$ denote the determinant of the $k \times k$ 
block in the upper left corner of $g$. 

Now, consider the function $F :$ SL(n;C) $\to \mathbb{C}$ given by 

\[ F(g) = [\det\nolimits_1(g)]^{m_1} [\det\nolimits_2(g)]^{m_2} \cdots [\det\nolimits_{n-1}(g)]^{m_{n-1}}. \]


Show that F is a polynomial in the entries of g and satisfies 


F (bg) = X(b*)F(g) 


for all $b \in B^-$ and all $g \in$ SL(n;C). (Compare Definition 7.33.) 

Note: It can be shown that the homomorphisms in (7.51) are precisely the 
$X_\mu$’s corresponding to dominant integral elements $\mu$ for sl(n;C). 

7. Verify the Weyl dimension formula for the representations of sl(3;C) with 
highest weights (1, 2), (2,2), and (0, 4), using the weights and multiplicities 
given in Section 5.7. 

8. Consider the Cartan subalgebra for sl(n;C) given in Section 6.9 and consider 
sider the system of positive roots described in that section. Show that half 
the sum of the positive roots is an integral element. 


8 


More on Roots and Weights 


8.1 Abstract Root Systems 


In this section (and the next several sections), we consider root systems apart 
from their origins in semisimple Lie algebras. There are many results about 
root systems that are relevant to the understanding of semisimple Lie algebras 
but whose proofs involve only the root systems and not the Lie algebras from 
which they came. Therefore, it is convenient to separate the theory of root 
systems from Lie algebras. 


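Definition 8.1. A root system is a finite-dimensional real vector space $E$ 
with an inner product $\langle \cdot, \cdot \rangle$, together with a finite collection $R$ of nonzero 
vectors in $E$ satisfying the following properties: 


1. The vectors in $R$ span $E$. 
2. If $\alpha$ is in $R$, then so is $-\alpha$. 
3. If $\alpha$ is in $R$, then the only multiples of $\alpha$ in $R$ are $\alpha$ and $-\alpha$. 
4. If $\alpha$ and $\beta$ are in $R$, then so is $w_\alpha \cdot \beta$, where $w_\alpha$ is the linear transformation 
of $E$ defined by 

\[ w_\alpha \cdot \beta = \beta - 2\frac{\langle \beta, \alpha \rangle}{\langle \alpha, \alpha \rangle}\alpha, \qquad \beta \in E. \]

5. For all $\alpha$ and $\beta$ in $R$, the quantity 

\[ 2\frac{\langle \beta, \alpha \rangle}{\langle \alpha, \alpha \rangle} \]

is an integer. 


The dimension of $E$ is called the rank of the root system and the elements 
of $R$ are called roots. 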


Property 2 is actually redundant, since $w_\alpha \cdot \alpha = -\alpha$. In the theory of 
symmetric spaces, there arise systems satisfying Properties 1, 2, 4, and 5 but 
not Property 3. These are called “unreduced root systems.” We will consider 
only “reduced root systems,” that is, those satisfying Property 3 as well as the other 
four properties. 

The map $w_\alpha$ is the reflection about the hyperplane perpendicular to $\alpha$; 
that is, $w_\alpha \cdot \alpha = -\alpha$ and $w_\alpha \cdot \beta = \beta$ for all $\beta$ in $E$ that are perpendicular to $\alpha$, 
as is easily verified from the formula for $w_\alpha$. From this description, it should 
be evident that $w_\alpha$ is an orthogonal transformation of $E$ with determinant 
$-1$. 

We can interpret Property 5 geometrically in one of two ways. In light 
of the formula for $w_\alpha$, Property 5 is equivalent to saying that $w_\alpha \cdot \beta$ should 
differ from $\beta$ by an integer multiple of $\alpha$. Alternatively, if we recall that the 
orthogonal projection of $\beta$ onto $\alpha$ is given by $(\langle \beta, \alpha \rangle / \langle \alpha, \alpha \rangle)\alpha$, we note that 
the quantity in Property 5 is twice the coefficient of $\alpha$ in this projection. Thus, 
Property 5 is equivalent to saying that the projection of $\beta$ onto $\alpha$ is an integer 
or half-integer multiple of $\alpha$. 

If the rank of the root system is one, then there is only one possibility: 
$R$ must consist of a pair $\{\alpha, -\alpha\}$, where $\alpha$ is a nonzero element of $E$. This 
root system is customarily called “$A_1$” and is shown in Figure 8.1. In rank 
two, there are four possibilities, pictured in Figure 8.2 with their conventional 
names. In the case of $A_1 \times A_1$, the lengths of the horizontal roots are unrelated 
to the lengths of the vertical roots. In $A_2$, all roots have the same length; in 
$B_2$, the length of the longer roots is $\sqrt{2}$ times the length of the shorter roots; in 
$G_2$, the length of the longer roots is $\sqrt{3}$ times the length of the shorter roots. 
The angle between successive roots is 90° for $A_1 \times A_1$, 60° for $A_2$, 45° for $B_2$, 
and 30° for $G_2$. The reader is invited to verify that each of these systems is 
actually a root system. That every root system in rank two is actually one of 
the ones pictured here is proved in Section 8.5. Examples of rank-three root 
systems are given in Section 8.6. 
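
The verification suggested above can also be delegated to a short program. The sketch below is one possible check, not the book's own; it assumes particular integer coordinates for the four rank-two systems ($A_2$ and $G_2$ are placed in the plane $x + y + z = 0$ inside $\mathbb{R}^3$, and $A_1 \times A_1$ is given two deliberately unrelated lengths), and it tests only Properties 2 through 5 of Definition 8.1, Property 1 being clear by inspection. All names are hypothetical.

```python
from fractions import Fraction


def inner(u, v):
    return sum(a * b for a, b in zip(u, v))


def negate(v):
    return tuple(-x for x in v)


def reflect(alpha, beta):
    """Return w_alpha . beta = beta - 2<beta,alpha>/<alpha,alpha> alpha."""
    c = Fraction(2 * inner(beta, alpha), inner(alpha, alpha))
    return tuple(b - c * a for a, b in zip(alpha, beta))


def check_properties_2_to_5(roots):
    root_set = set(roots)
    for alpha in roots:
        if negate(alpha) not in root_set:                      # Property 2
            return False
        for beta in roots:
            parallel = inner(alpha, beta) ** 2 == inner(alpha, alpha) * inner(beta, beta)
            if parallel and beta not in (alpha, negate(alpha)):  # Property 3
                return False
            if reflect(alpha, beta) not in root_set:           # Property 4
                return False
            if (2 * inner(beta, alpha)) % inner(alpha, alpha) != 0:  # Property 5
                return False
    return True


def with_negatives(vectors):
    return list(vectors) + [negate(v) for v in vectors]


A1xA1 = with_negatives([(1, 0), (0, 3)])
A2 = with_negatives([(1, -1, 0), (0, 1, -1), (1, 0, -1)])
B2 = with_negatives([(1, 0), (0, 1), (1, 1), (1, -1)])
G2 = A2 + with_negatives([(2, -1, -1), (-1, 2, -1), (-1, -1, 2)])

for name, roots in [("A1xA1", A1xA1), ("A2", A2), ("B2", B2), ("G2", G2)]:
    print(name, len(roots), check_properties_2_to_5(roots))
```

A set such as $\{\pm(1,0), \pm(1,1)\}$ fails the check, because the reflection of $(1,1)$ about the line perpendicular to $(1,0)$ is $(-1,1)$, which is not in the set.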


Fig. 8.1. The root system $A_1$ 


We have shown that one can associate a root system to every complex 
semisimple Lie algebra. It turns out that every root system arises in this way, 
although this is very far from obvious—see Section 8.9. 


Definition 8.2. If $(E, R)$ is a root system, then the Weyl group $W$ of $R$ 
is the subgroup of the orthogonal group of $E$ generated by the reflections $w_\alpha$, 
$\alpha \in R$. 


By assumption, each $w_\alpha$ maps $R$ into itself, indeed onto itself, since each 
$\beta \in R$ satisfies $\beta = w_\alpha \cdot (w_\alpha \cdot \beta)$. It follows that every element of $W$ maps $R$ 
onto itself. Since the roots span $E$, a linear transformation of $E$ is determined 
by its action on $R$. This shows (compare Proposition 6.29) that the Weyl 
group is a finite subgroup of $O(E)$ and may be regarded as a subgroup of the 
permutation group on $R$. We follow the notation of Chapter 6 and write the 
action of a Weyl group element $w$ on an element $H$ of $E$ as $w \cdot H$. 


Fig. 8.2. The rank-two root systems ($A_1 \times A_1$, $A_2$, $B_2$, $G_2$) 


Proposition 8.3. Suppose $(E, R)$ and $(F, S)$ are root systems. Consider the 
vector space $E \oplus F$, with the natural inner product determined by the inner 
products on $E$ and $F$. Then, $R \cup S$ is a root system in $E \oplus F$, called the direct 
sum of $R$ and $S$. 


Here, we are identifying $E$ with the subspace of $E \oplus F$ consisting of all 
vectors of the form $(e, 0)$ with $e$ in $E$, and similarly for $F$. So, more precisely, 
$R \cup S$ means the elements of the form $(\alpha, 0)$ with $\alpha$ in $R$ together with elements 
of the form $(0, \beta)$ with $\beta$ in $S$. (Elements of the form $(\alpha, \beta)$ with $\alpha \in R$ and 
$\beta \in S$ are not in $R \cup S$.) 


Proof. If $R$ spans $E$ and $S$ spans $F$, then $R \cup S$ spans $E \oplus F$, so Condition 
1 is satisfied. Conditions 2 and 3 are easily verified. For Conditions 4 and 5, 
consider, say, $\alpha \in R$ and some root $\beta$ in $R \cup S$. If $\beta$ is in $R$, then Conditions 4 
and 5 hold because $R$ is a root system. If $\beta$ is in $S$, then $\langle \beta, \alpha \rangle = 0$ (since $E$ 
is orthogonal to $F$ inside $E \oplus F$) and so $w_\alpha \cdot \beta = \beta \in R \cup S$ and the quantity 
in Condition 5 is zero. The same argument applies if $\alpha$ is in $S$. □ 


Definition 8.4. A root system $(E, R)$ is called reducible if there exists an 
orthogonal decomposition $E = E_1 \oplus E_2$ with $\dim E_1 > 0$ and $\dim E_2 > 0$ such 
that every element of $R$ is either in $E_1$ or in $E_2$. If no such decomposition 
exists, $(E, R)$ is called irreducible. 


If $(E, R)$ is reducible, then it is not hard to see that the part of $R$ in $E_1$ 
is a root system in $E_1$ and the part of $R$ in $E_2$ is a root system in $E_2$. So, a 
root system is reducible precisely if it can be realized as a direct sum of two 
other root systems. In the Lie algebra setting, the root system associated to 
a complex semisimple Lie algebra $\mathfrak{g}$ is irreducible precisely if $\mathfrak{g}$ is simple. 


Definition 8.5. Two root systems $(E, R)$ and $(F, S)$ are said to be equivalent 
if there exists an invertible linear transformation $A : E \to F$ such that $A$ maps 
$R$ onto $S$ and such that for all $\alpha \in R$ and $\beta \in E$, we have 

\[ A(w_\alpha \cdot \beta) = w_{A\alpha} \cdot A\beta. \]

A map $A$ with this property is called an equivalence. 


Note that the linear map $A$ is not required to preserve inner products, but 
only to preserve the reflections about the roots. It is easily seen that if $A$ is 
an orthogonal transformation of $E$ to $F$ and if $A$ takes $R$ onto $S$, then $A$ is 
an equivalence. However, not every equivalence is of this form. For example, 
we may take $F = E$ and $S = \lambda R$, where $\lambda$ is a nonzero real number. (In this 
case $(E, S)$ is again a root system, as is easily verified.) Then, $A = \lambda I$ is an 
equivalence of $(E, R)$ with $(E, S)$. To see this, note that $w_{\lambda\alpha}$ is the same as 
$w_\alpha$ since the hyperplane perpendicular to $\lambda\alpha$ is the same as the hyperplane 
perpendicular to $\alpha$. 

We see, then, that if one multiplies all of the roots in a root system by 
a nonzero constant, one gets another root system that is “the same as” (i.e., 
equivalent to) the original root system. Note that the quantity $2\langle \alpha, \beta \rangle / \langle \alpha, \alpha \rangle$ 
is unchanged if both $\alpha$ and $\beta$ are multiplied by the same constant. So, the 
actual lengths of the roots in a root system are not important. Nevertheless, 
the ratios of lengths of different roots in the root systems are important (and 
unchanged if all the roots are multiplied by the same constant). 


Proposition 8.6. Suppose $\alpha$ and $\beta$ are roots, $\alpha$ is not a multiple of $\beta$, and 
$\langle \alpha, \alpha \rangle \geq \langle \beta, \beta \rangle$. Then, one of the following holds: 


1. $\langle \alpha, \beta \rangle = 0$ 
2. $\langle \alpha, \alpha \rangle = \langle \beta, \beta \rangle$ and the angle between $\alpha$ and $\beta$ is 60° or 120° 
3. $\langle \alpha, \alpha \rangle = 2\langle \beta, \beta \rangle$ and the angle between $\alpha$ and $\beta$ is 45° or 135° 
4. $\langle \alpha, \alpha \rangle = 3\langle \beta, \beta \rangle$ and the angle between $\alpha$ and $\beta$ is 30° or 150° 


So, if two roots are not multiples of one another and are not perpendicular 
to one another, then the ratio of the length of the longer to the length of the 
shorter must be 1, $\sqrt{2}$, or $\sqrt{3}$, and for each case, there is only one possible 
acute angle and one possible obtuse angle. If the roots are perpendicular, then 
there is no constraint on the ratio of their lengths. The rank-two examples in 
Figure 8.2 demonstrate that each of the angles and length ratios permitted 
by Proposition 8.6 actually occurs. Figure 8.3 shows the allowed angles and 
length ratios, for the case of an acute angle. 

[Figure 8.3 panels: $\|\alpha\| = \|\beta\|$, $\theta = 60°$, $\cos\theta = 1/2$; $\|\alpha\| = \sqrt{2}\,\|\beta\|$, $\theta = 45°$, $\cos\theta = 1/\sqrt{2}$; $\|\alpha\| = \sqrt{3}\,\|\beta\|$, $\theta = 30°$, $\cos\theta = \sqrt{3}/2$] 

Fig. 8.3. Basic lengths and angles 


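Proof. Suppose that $\alpha$ and $\beta$ are roots and let $m_1 = 2\langle \alpha, \beta \rangle / \langle \alpha, \alpha \rangle$ and $m_2 = 
2\langle \beta, \alpha \rangle / \langle \beta, \beta \rangle$. Assume $\langle \alpha, \alpha \rangle \geq \langle \beta, \beta \rangle$. (If not, reverse the labeling of the 
two.) By Property 5, $m_1$ and $m_2$ must be integers. Note that 

\[ m_1 m_2 = 4\frac{\langle \alpha, \beta \rangle^2}{\langle \alpha, \alpha \rangle \langle \beta, \beta \rangle} = 4\cos^2\theta, \tag{8.1} \]

where $\theta$ is the angle between $\alpha$ and $\beta$, and that 

\[ \frac{m_2}{m_1} = \frac{\langle \alpha, \alpha \rangle}{\langle \beta, \beta \rangle} \tag{8.2} \]

whenever $\langle \alpha, \beta \rangle \neq 0$. From (8.1), we conclude that $0 \leq m_1 m_2 \leq 4$. If $m_1 m_2 = 
0$, then $\cos\theta = 0$, so $\alpha$ and $\beta$ are perpendicular. If $m_1 m_2 = 4$, then $\cos^2\theta = 1$, 
so $\theta$ is zero or 180° (i.e., $\alpha$ and $\beta$ are multiples of one another, and so $\alpha = \pm\beta$). 

The remaining possible values for $m_1 m_2$ are 1, 2, and 3, which we consider 
in turn. If $m_1 m_2 = 1$, then $\cos^2\theta = 1/4$, so $\theta$ is 60° or 120°. Since $m_1$ and 
$m_2$ are both integers, we must have $m_1 = 1$ and $m_2 = 1$ or $m_1 = -1$ and 
$m_2 = -1$, and in either case, (8.2) tells us that $\alpha$ and $\beta$ have the same length, 
and we get Case 2 of the proposition. 

If $m_1 m_2 = 2$, then $\cos^2\theta = 1/2$, so $\theta$ is 45° or 135°. Since we assume 
$\langle \alpha, \alpha \rangle \geq \langle \beta, \beta \rangle$, (8.2) tells us that $|m_2| \geq |m_1|$, so we have either $m_2 = 2$, 
$m_1 = 1$ or $m_2 = -2$, $m_1 = -1$, and we are in Case 3 of the proposition. 

Finally, if $m_1 m_2 = 3$, then $\cos^2\theta = 3/4$ and so $\theta$ is 30° or 150°. Since 
we assume $\langle \alpha, \alpha \rangle \geq \langle \beta, \beta \rangle$, (8.2) tells us that $|m_2| \geq |m_1|$, so we have either 
$m_2 = 3$, $m_1 = 1$ or $m_2 = -3$, $m_1 = -1$, and we are in Case 4 of the 
proposition. 

In the last three cases, having $m_1$ and $m_2$ both positive corresponds to an 
acute angle ($\langle \alpha, \beta \rangle > 0$) and having $m_1$ and $m_2$ both negative corresponds to 
an obtuse angle ($\langle \alpha, \beta \rangle < 0$). □ 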
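
The case analysis in the proof amounts to a small finite search, which the following sketch carries out. It is offered only as an illustration: the function name is hypothetical, the search window $-4, \ldots, 4$ is an arbitrary choice large enough to contain every pair with $1 \leq m_1 m_2 \leq 3$, and the angle is recovered from (8.1) using the fact that $\cos\theta$ has the same sign as $m_1$.

```python
import math


def allowed_configurations():
    """List (m1, m2, length_ratio_squared, angle_in_degrees) for every integer
    pair with m1*m2 in {1, 2, 3} and |m2| >= |m1|."""
    results = []
    for m1 in range(-4, 5):
        for m2 in range(-4, 5):
            prod = m1 * m2
            if prod in (1, 2, 3) and abs(m2) >= abs(m1):
                # By (8.1), cos^2(theta) = m1*m2/4; the sign of cos(theta)
                # is the sign of <alpha, beta>, hence the sign of m1.
                cos_theta = math.copysign(math.sqrt(prod) / 2, m1)
                angle = round(math.degrees(math.acos(cos_theta)))
                # By (8.2), <alpha,alpha>/<beta,beta> = m2/m1 (here m1 = +-1).
                results.append((m1, m2, m2 // m1, angle))
    return results


for row in allowed_configurations():
    print(row)
```

The six rows produced are exactly the acute and obtuse possibilities in Cases 2, 3, and 4 of Proposition 8.6.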


Corollary 8.7. Suppose that $\alpha$ and $\beta$ are roots. If the angle between $\alpha$ and $\beta$ 
is strictly obtuse (i.e., strictly between 90° and 180°), then $\alpha + \beta$ is a root. If 
the angle between $\alpha$ and $\beta$ is strictly acute (i.e., strictly between 0 and 90°), 
then $\alpha - \beta$ and $\beta - \alpha$ are roots. 


Proof. The proof is by examining each of the three obtuse angles and each of 
the three acute angles allowed by Proposition 8.6. Consider first the acute case 
and adjust the labeling so that $\langle \alpha, \alpha \rangle \geq \langle \beta, \beta \rangle$. An examination of Cases 2, 
3, and 4 in the proposition (see Figure 8.3) shows that in each of these cases, 
the projection of $\beta$ onto $\alpha$ is equal to $\tfrac{1}{2}\alpha$. (This corresponds to $m_1 = 1$ in the 
notation of the proof of the proposition.) In that case, $w_\alpha \cdot \beta = \beta - \alpha$. Thus, 
$\beta - \alpha$ and $\alpha - \beta = -(\beta - \alpha)$ are roots. In the obtuse case (with $\langle \alpha, \alpha \rangle \geq \langle \beta, \beta \rangle$), 
we have the projection of $\beta$ onto $\alpha$ equal to $-\tfrac{1}{2}\alpha$, and, so, $w_\alpha \cdot \beta = \alpha + \beta$ is a root. □ 


8.2 Duality 


Definition 8.8. If $(E, R)$ is a root system, then for each root $\alpha \in R$, define 
the co-root $H_\alpha$ by 

\[ H_\alpha = 2\frac{\alpha}{\langle \alpha, \alpha \rangle}. \]

The set of all co-roots is denoted $R^\vee$ and is called the dual root system to 
$R$. 


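Proposition 6.26 shows that this definition is consistent with the use of 
the term “co-root” in Chapter 6. Property 5 in the definition of a root system 
may be restated as the condition that $\langle \beta, H_\alpha \rangle$ be an integer for all roots $\alpha$ 
and $\beta$. We compute that 

\[ \langle H_\alpha, H_\alpha \rangle = 4\frac{\langle \alpha, \alpha \rangle}{\langle \alpha, \alpha \rangle^2} = \frac{4}{\langle \alpha, \alpha \rangle} \]

and, therefore, 

\[ 2\frac{H_\alpha}{\langle H_\alpha, H_\alpha \rangle} = 2\left(\frac{2\alpha}{\langle \alpha, \alpha \rangle}\right)\frac{\langle \alpha, \alpha \rangle}{4} = \alpha, \tag{8.3} \]

and so the formula for $\alpha$ in terms of $H_\alpha$ is exactly the same as the formula 
for $H_\alpha$ in terms of $\alpha$. Furthermore, if we take the inner product of (8.3) with 
$H_\beta$, we see that 

\[ 2\frac{\langle H_\alpha, H_\beta \rangle}{\langle H_\alpha, H_\alpha \rangle} = 2\frac{\langle \alpha, \beta \rangle}{\langle \beta, \beta \rangle}. \tag{8.4} \]

This means that the left-hand side of (8.4) is an integer. 
Furthermore, since $H_\alpha$ is a multiple of $\alpha$, the hyperplane perpendicular to 
$\alpha$ is the same as the hyperplane perpendicular to $H_\alpha$. This means that the 
reflection associated to $H_\alpha$ is the same as the reflection associated to $\alpha$ (i.e., 
$w_{H_\alpha} = w_\alpha$). However, since $w_\alpha$ is an orthogonal transformation, we have 

\[ w_\alpha \cdot H_\beta = 2\frac{w_\alpha \cdot \beta}{\langle \beta, \beta \rangle} = 2\frac{w_\alpha \cdot \beta}{\langle w_\alpha \cdot \beta, w_\alpha \cdot \beta \rangle} = H_{w_\alpha \cdot \beta}. \]

This shows that the set of co-roots is invariant under each reflection $w_\alpha$ 
($= w_{H_\alpha}$). This, together with (8.4), shows that $R^\vee$ is again a root system. 
(The remaining properties of root systems for $R^\vee$ follow immediately from the 
corresponding properties of $R$.) So, we have established the following result. 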


Proposition 8.9. If R is a root system, then RY is also a root system and 
the Weyl group for RY is the same as the Weyl group for R. Furthermore, 
(RY) =R. 


Note from (8.4) that the integer associated to the pair (Ha, Hg) in RY is the 
same as the integer associated to the pair (a, 3) (not (G,q@)) in R. 

It follows from (8.3) that $(R^\vee)^\vee = R$; that is, if $R^\vee$ is dual to
$R$, then $R$ is also dual to $R^\vee$. If all the roots in $R$ have the same
length $L$, as in the case of the root system associated to
$\mathrm{sl}(n;\mathbb{C})$, then $R^\vee$ is obtained from $R$ by multiplying
each of the elements of $R$ by a constant (namely $2/L^2$). In that case, the
root system $R^\vee$ will be equivalent to $R$. Even if not all of the elements
of $R$ have the same length, it could still happen that $R^\vee$ is equivalent
to $R$. This is the case, for example, for the rank-two root systems $B_2$ and
$G_2$ (Figure 8.2). In general, however, $R^\vee$ need not be equivalent to $R$.
For example, the rank-three root systems $B_3$ and $C_3$ (Section 8.6) are dual
to each other but not equivalent to each other.
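The duality statements above can be checked by hand in small cases, and also
by machine. The following Python sketch does so for $B_2$ with exact rational
arithmetic. The coordinates used for $B_2$ (the vectors $\pm e_1$, $\pm e_2$,
$\pm e_1 \pm e_2$ in $\mathbb{R}^2$) and the helper names `inner`, `coroot`, and
`cartan_integer` are assumptions made for this illustration; they are not
notation from the text.

```python
from fractions import Fraction

def inner(u, v):
    """Standard inner product on R^n."""
    return sum(a * b for a, b in zip(u, v))

def coroot(alpha):
    """Return H_alpha = 2*alpha / (alpha, alpha) with exact rational entries."""
    norm_sq = inner(alpha, alpha)
    return tuple(Fraction(2) * a / norm_sq for a in alpha)

def cartan_integer(x, y):
    """Return 2 (x, y) / (x, x)."""
    return Fraction(2) * inner(x, y) / inner(x, x)

# Assumed model of B2: four short roots and four long roots.
B2 = [(1, 0), (-1, 0), (0, 1), (0, -1), (1, 1), (1, -1), (-1, 1), (-1, -1)]
dual = [coroot(a) for a in B2]

# Taking co-roots twice should return the original roots, as in (8.3).
double_dual = {coroot(h) for h in dual}

# Left side of (8.4): 2 (H_a, H_b) / (H_a, H_a).
# Right side of (8.4): 2 (a, b) / (b, b).
identity_84_holds = all(
    cartan_integer(coroot(a), coroot(b)) == cartan_integer(b, a)
    for a in B2 for b in B2
)
```

In this model, the short roots $\pm e_1, \pm e_2$ become the long co-roots
$\pm 2e_1, \pm 2e_2$, whereas the long roots $\pm e_1 \pm e_2$ are their own
co-roots and become the short elements of the dual system.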


8.3 Bases and Weyl Chambers 


Definition 8.10. A subset $\Delta$ of $R$ is called a base for $R$ if the
following conditions hold:


1. $\Delta$ is a basis for $E$ as a vector space.

2. Each root $\alpha \in R$ can be expressed as a linear combination of elements
of $\Delta$ with integer coefficients and in such a way that the coefficients
are either all non-negative or all nonpositive.


The roots for which the coefficients are non-negative are called positive
roots and the others are called negative roots (relative to the base $\Delta$).
The set of positive roots relative to a fixed base $\Delta$ is denoted $R^+$.
The elements of $\Delta$ are called the positive simple roots.


Note that since $\Delta$ is a basis for $E$, each $\alpha$ can be expressed
uniquely as a linear combination of the elements of $\Delta$. We require that
$\Delta$ be such that the coefficients in the expansion of each $\alpha \in R$
be integers and such that all the nonzero coefficients have the same sign.


Proposition 8.11. If $\alpha$ and $\beta$ are distinct elements of a base
$\Delta$ for $R$, then $(\alpha, \beta) \le 0$.


Geometrically, this means that either $\alpha$ and $\beta$ are perpendicular or
the angle between them is obtuse.

Proof. Since $\alpha \ne \beta$, if we had $(\alpha, \beta) > 0$, then the angle
between $\alpha$ and $\beta$ would be strictly between 0 and 90°. Then, by
Corollary 8.7, $\alpha - \beta$ would be an element of $R$. Since the elements
of $\Delta$ form a basis for $E$ as a vector space, each element of $R$ has a
unique expansion in terms of elements of $\Delta$, and the coefficients of that
expansion are supposed to be either all non-negative or all nonpositive.
However, the expansion of $\alpha - \beta$ has one positive and one negative
coefficient. So, $\alpha - \beta$ must actually not be a root, which means
$(\alpha, \beta) \le 0$. □


It is far from obvious that a base exists. The reader is invited to look at 
the rank-two root systems in Figure 8.2 and find a base for each one. We now 
prove that a base always exists and give a constructive method for finding 
one. 


Proposition 8.12. If E is a finite-dimensional real vector space and R is any 
finite subset of E not containing 0, then there exists a hyperplane V through 
the origin in E that does not contain any element of R. 


Proof. To prove this, we try to find a vector $H$ in $E$ such that
$(H, \alpha)$ is nonzero for each $\alpha$ in $R$ (in which case $H$ itself must
certainly be nonzero). If we can find such an $H$, then we may take $V$ to be
the orthogonal complement of $H$, and $V$ will be a hyperplane through the
origin and (by the way $H$ is constructed) $V$ will not contain any element of
$R$. How do we find $H$? Well, $H$ cannot be contained in any of the
hyperplanes $V_\alpha = \{H \in E \mid (\alpha, H) = 0\}$. So, if we can prove
that the finite collection of hyperplanes $V_\alpha$ cannot fill up all of $E$,
then there will be points not in any $V_\alpha$ and so we can find $H$ and,
thus, $V$.

It remains to prove that the union of the finite collection of hyperplanes
$\{V_\alpha\}_{\alpha \in R}$ cannot be all of $E$. This can be done, for
example, using measure theory: Each $V_\alpha$ is a set of Lebesgue measure zero
in $E$, and so the union of the $V_\alpha$'s is again a set of measure zero,
and so the union cannot be all of $E$. □
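The measure-zero argument also suggests a practical way to find $H$: a vector
chosen at random will, with probability one, miss every hyperplane $V_\alpha$.
The following Python sketch illustrates this for the root system $A_2$. The
function name `find_regular_vector`, the tolerance, the fixed random seed, and
the coordinates used for $A_2$ are all assumptions made for this illustration.

```python
import random

def inner(u, v):
    """Standard inner product on R^n."""
    return sum(a * b for a, b in zip(u, v))

def find_regular_vector(vectors, dim, tol=1e-9, seed=0, max_tries=1000):
    """Return H with |(H, a)| > tol for every a in `vectors`.

    Random sampling succeeds almost surely, because the union of the
    hyperplanes {H : (a, H) = 0} has measure zero.
    """
    rng = random.Random(seed)
    for _ in range(max_tries):
        H = [rng.uniform(-1.0, 1.0) for _ in range(dim)]
        if all(abs(inner(H, a)) > tol for a in vectors):
            return H
    raise RuntimeError("no suitable vector found")

# Assumed model of A2: six unit vectors at 60-degree angles.
s = 3 ** 0.5 / 2
A2 = [(1.0, 0.0), (-1.0, 0.0), (0.5, s), (-0.5, -s), (-0.5, s), (0.5, -s)]

H = find_regular_vector(A2, dim=2)
```

The orthogonal complement of the resulting $H$ is then a hyperplane $V$ of the
kind described in Proposition 8.12.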


Definition 8.13. Let $(E, R)$ be a root system. Let $V$ be a hyperplane through
the origin in $E$ such that $V$ does not contain any root. Choose one “side” of
$V$ and let $R^+$ denote the set of roots on this side of $V$. An element
$\alpha$ of $R^+$ is called decomposable if there exist $\beta$ and $\gamma$ in
$R^+$ such that $\alpha = \beta + \gamma$; if no such elements exist, $\alpha$
is called indecomposable.


The “sides” of $V$ can be defined as the connected components of the set
$E - V$. Alternatively, choose a nonzero element $\mu$ of the (one-dimensional)
orthogonal complement of $V$. Then, $V$ is precisely the set of $H$ in $E$ for
which $(\mu, H) = 0$. The two “sides” of $V$ are then the sets
$\{H \in E \mid (\mu, H) > 0\}$ and $\{H \in E \mid (\mu, H) < 0\}$. Choosing a
different nonzero $\mu$ in $V^\perp$ (which must be a constant multiple of the
original choice) either leaves these two sets unchanged or interchanges them.

The notion of indecomposable is understood to be relative to the choice of
$V$ and the choice of a side of $V$. This construction of a base for $R$
motivates the use of the phrase “positive simple root” for the elements of the
base $\Delta$. In this construction, we first find the set of positive roots;
then, the elements that are positive and “simple” (i.e., indecomposable) form
the base.


Theorem 8.14. Suppose $(E, R)$ is a root system, $V$ is a hyperplane through
the origin in $E$ not containing any element of $R$, and $R^+$ is the set of
roots lying on one fixed side of $V$. Then, the set of indecomposable elements
of $R^+$ is a base for $R$.


The reader is invited to carry out this procedure for each of the rank-two 
root systems listed in Section 8.5. 
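For readers who would like to check their answers, here is a Python sketch of
the procedure in Theorem 8.14, applied to $B_2$ and $G_2$. The integer
coordinates are assumptions made for this illustration: $B_2$ is taken to be
$\pm e_1$, $\pm e_2$, $\pm e_1 \pm e_2$ in $\mathbb{R}^2$, and $G_2$ is realized
inside the plane $x + y + z = 0$ in $\mathbb{R}^3$. The name
`base_from_hyperplane` and the particular vectors `gamma` (normal vectors to
the chosen hyperplanes) are likewise hypothetical.

```python
from itertools import permutations

def inner(u, v):
    """Standard inner product on R^n."""
    return sum(a * b for a, b in zip(u, v))

def base_from_hyperplane(roots, gamma):
    """Indecomposable roots on the side of {H : (gamma, H) = 0} containing gamma.

    Roots must be tuples of integers so that sums can be compared exactly.
    """
    if any(inner(gamma, a) == 0 for a in roots):
        raise ValueError("the hyperplane contains a root")
    positive = [a for a in roots if inner(gamma, a) > 0]
    positive_set = set(positive)

    def decomposable(a):
        # a = b + c with b, c positive  <=>  a - b is positive for some positive b
        return any(
            tuple(x - y for x, y in zip(a, b)) in positive_set for b in positive
        )

    return [a for a in positive if not decomposable(a)]

B2 = [(1, 0), (-1, 0), (0, 1), (0, -1), (1, 1), (1, -1), (-1, 1), (-1, -1)]
base_B2 = base_from_hyperplane(B2, gamma=(2, 1))

# G2: six short roots (permutations of (1,-1,0)) and six long roots.
G2 = sorted(set(permutations((1, -1, 0)))
            | set(permutations((2, -1, -1)))
            | set(permutations((-2, 1, 1))))
base_G2 = base_from_hyperplane(G2, gamma=(-1, -2, 3))

# Coefficient pairs (m, n) for which m*a1 + n*a2 is a root of G2.
a1, a2 = (1, -1, 0), (-2, 1, 1)
G2_set = set(G2)
coefficients = sorted(
    (m, n) for m in range(4) for n in range(4)
    if tuple(m * x + n * y for x, y in zip(a1, a2)) in G2_set
)
```

With these choices, the base found for $B_2$ consists of $(0, 1)$ and
$(1, -1)$, and the base found for $G_2$ consists of the short root `a1` and
the long root `a2`. The list `coefficients` records how each positive root of
$G_2$ expands in that base.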


Proof. Choose a nonzero vector $\gamma$ in the orthogonal complement of $V$
(which is one dimensional) in such a way that $\gamma$ lies on the positive side
of $V$ (i.e., the side containing $R^+$). Then, $V$ is simply the set of
$H \in E$ such that $(\gamma, H) = 0$ and the positive side of $V$ is the set of
$H \in E$ such that $(\gamma, H) > 0$. Let $\Delta$ denote the set of
indecomposable elements in $R^+$.


Step 1. Every $\alpha \in R^+$ can be expressed as a linear combination of
elements of $\Delta$ with non-negative integer coefficients. Suppose not. Then,
among all of the elements of $R^+$ that cannot be expressed in this way, choose
$\alpha$ so that $(\gamma, \alpha)$ is as small as possible. Certainly $\alpha$
cannot be an element of $\Delta$, so $\alpha$ must be decomposable,
$\alpha = \beta_1 + \beta_2$, with $\beta_1, \beta_2 \in R^+$. Now, $\beta_1$
and $\beta_2$ cannot both be expressible as linear combinations of elements of
$\Delta$ with non-negative integer coefficients, or else $\alpha$ would be
expressible in this way. However,
$(\gamma, \alpha) = (\gamma, \beta_1) + (\gamma, \beta_2)$, and since the
numbers $(\gamma, \beta_1)$ and $(\gamma, \beta_2)$ are both positive, they
must be smaller than $(\gamma, \alpha)$. Thus, whichever of $\beta_1$ and
$\beta_2$ is not expressible contradicts the minimality of $\alpha$.


Step 2. If $\alpha$ and $\beta$ are distinct elements of $\Delta$, then
$(\alpha, \beta) \le 0$. If we had $(\alpha, \beta) > 0$, then by Corollary 8.7,
$\alpha - \beta$ and $\beta - \alpha$ would both be roots, one of which would
have to be positive. If $\alpha - \beta$ were positive, then we would have
$\alpha = (\alpha - \beta) + \beta$ and $\alpha$ would be decomposable. If
$\beta - \alpha$ were positive, then we would have
$\beta = (\beta - \alpha) + \alpha$, and $\beta$ would be decomposable. Since
$\alpha$ and $\beta$ are assumed indecomposable, we must have
$(\alpha, \beta) \le 0$.


Step 3. The elements of $\Delta$ are linearly independent. Suppose we have

$$\sum_{\alpha \in \Delta} c_\alpha \alpha = 0 \tag{8.5}$$

for some collection of constants $c_\alpha$. Then, we may separate the sum into
those terms where $c_\alpha \ge 0$ and those where $c_\alpha = -d_\alpha < 0$
and we obtain

$$\sum c_\alpha \alpha = \sum d_\beta \beta \tag{8.6}$$

where the sums range over disjoint subsets of $\Delta$ and where
$c_\alpha \ge 0$ and $d_\beta > 0$. Let $u = \sum c_\alpha \alpha$. From (8.6),
we have

$$(u, u) = \left(\sum c_\alpha \alpha, \sum d_\beta \beta\right) = \sum \sum c_\alpha d_\beta (\alpha, \beta).$$

However, $c_\alpha$ and $d_\beta$ are non-negative and (by Step 2)
$(\alpha, \beta) \le 0$. So, $(u, u) \le 0$, which can only happen if $u$ is
zero.

Now, if $u = 0$, then $(\gamma, u) = \sum c_\alpha (\gamma, \alpha) = 0$, which
implies that all the $c_\alpha$'s are zero since $c_\alpha \ge 0$ and
$(\gamma, \alpha) > 0$. The same sort of reasoning shows that all the
$d_\beta$'s are zero, so that all of the coefficients in the original expansion
(8.5) are zero. This shows that $\Delta$ is linearly independent.


Step 4. $\Delta$ is a base. We have shown that $\Delta$ is linearly independent
and that all of the elements of $R^+$ can be expressed as linear combinations
of elements of $\Delta$ with non-negative integer coefficients. The remaining
elements of $R$, namely the elements of $R^-$, are simply the negatives of the
elements of $R^+$, and so they can be expressed as linear combinations of
elements of $\Delta$ with nonpositive integer coefficients. Since the elements
of $R$ span $E$, $\Delta$ must also span $E$ and it is a base. □


Theorem 8.15. Given any base $\Delta$ for $R$, there exists a hyperplane $V$
such that $\Delta$ arises as in Theorem 8.14.


Proof. If $\Delta = \{\alpha_1, \ldots, \alpha_r\}$ is a base for $R$, then
$\Delta$ is a basis for $E$ in the vector space sense. Then, by elementary
linear algebra, for any sequence of numbers $c_1, \ldots, c_r$ there exists a
unique $\gamma \in E$ with $(\gamma, \alpha_k) = c_k$, $k = 1, \ldots, r$. In
particular, we can choose $\gamma$ so that $(\gamma, \alpha_k) > 0$ for
$k = 1, \ldots, r$. Then, if $R^+$ denotes the positive roots with respect to
$\Delta$, we will have $(\gamma, \alpha) > 0$ for all $\alpha \in R^+$, since
$\alpha$ is a linear combination of elements of $\Delta$ with non-negative
coefficients. So, all of the elements of $R^+$ lie on the same side of the
hyperplane $V = \{H \in E \mid (\gamma, H) = 0\}$. It is not hard to see that
the elements of the original base $\Delta$ are indecomposable and so are
contained in the base constructed in Theorem 8.14. However, the number of
elements in a base must equal $\dim E$, so, actually, $\Delta$ is the base
constructed in Theorem 8.14. □


Proposition 8.16. If $\Delta$ is a base for $R$, then the set of all co-roots
$H_\alpha$, $\alpha \in \Delta$, is a base for the dual root system $R^\vee$.


Proof. Choose a hyperplane $V$ such that the base
$\Delta = \{\alpha_1, \ldots, \alpha_r\}$ for $R$ arises as in Theorem 8.14,
and call the side of $V$ on which $\Delta$ lies the positive side. Let $R^+$
denote the set of positive roots in $R$ relative to the base $\Delta$. Then the
co-roots $H_\alpha$, $\alpha \in R^+$, also lie on the positive side of $V$,
and all the remaining co-roots lie on the negative side of $V$. Thus, applying
Theorem 8.14 to $R^\vee$, there exist $\beta_1, \ldots, \beta_r$ in $R^+$ such
that $H_{\beta_1}, \ldots, H_{\beta_r}$ form a base $\Delta^\vee$ for $R^\vee$.
We want to show that
$\{\beta_1, \ldots, \beta_r\} = \{\alpha_1, \ldots, \alpha_r\}$.

Now, since the $\beta$'s are in $R^+$, they can be expanded as linear
combinations of the $\alpha$'s with non-negative (integer) coefficients. On the
other hand, for each $\alpha_k \in \Delta$, $H_{\alpha_k}$ lies on the positive
side of $V$, and so $H_{\alpha_k}$ is a positive root relative to the base
$\Delta^\vee$ for $R^\vee$. This means that $H_{\alpha_k}$ can be expanded in
terms of $H_{\beta_1}, \ldots, H_{\beta_r}$ with non-negative (integer)
coefficients. Since each co-root is simply a positive multiple of the
corresponding root, it follows that $\alpha_k$ can be written as a linear
combination of $\beta_1, \ldots, \beta_r$ with non-negative coefficients.

We conclude, then, that each $\beta_j$ can be expanded in terms of the
$\alpha$'s with non-negative coefficients and each $\alpha_k$ can be expanded
in terms of the $\beta$'s with non-negative coefficients. Now, suppose there is
some element of $\{\beta_1, \ldots, \beta_r\}$, which we may call $\beta_1$,
that is not an element of $\Delta = \{\alpha_1, \ldots, \alpha_r\}$. Then
$\beta_1$ cannot be a multiple of any $\alpha_k$, since if $\beta_1$ were equal
to some $\alpha_k$, then $\beta_1$ would be in $\Delta$ and if $\beta_1$ were
equal to $-\alpha_k$, then $\beta_1$ would be on the negative side of $V$.
Thus, the expansion of $\beta_1$ in terms of the $\alpha$'s must have at least
two nonzero coefficients. Then choose some $\alpha_k$ so that the expansion of
$\alpha_k$ in terms of the $\beta$'s has a nonzero coefficient for $\beta_1$.
(There must exist such an $\alpha_k$, or else all the $\alpha$'s would be
contained in the span of $\beta_2, \ldots, \beta_r$ and the $\alpha$'s would
not span $E$.) Now, expand $\alpha_k$ in terms of the $\beta$'s and then expand
each of the $\beta$'s in terms of the $\alpha$'s. Since all the coefficients in
both expansions are non-negative and since the coefficient of $\beta_1$ in the
expansion of $\alpha_k$ is nonzero, we will get an expansion of $\alpha_k$ in
terms of the $\alpha_j$'s in which at least two of the coefficients are
nonzero. This contradicts the linear independence of the $\alpha$'s, and so we
must, after all, have
$\{\beta_1, \ldots, \beta_r\} = \{\alpha_1, \ldots, \alpha_r\}$. □


Definition 8.17. If $\Delta = \{\alpha_1, \ldots, \alpha_r\}$ is a base for
$R$, then the open fundamental Weyl chamber in $E$ (relative to $\Delta$) is
the set of all $H$ in $E$ such that $(\alpha_k, H) > 0$ for all
$k = 1, \ldots, r$. The closed fundamental Weyl chamber in $E$ (relative to
$\Delta$) is the set of all $H$ in $E$ such that $(\alpha_k, H) \ge 0$ for all
$k = 1, \ldots, r$.


Since $\alpha_1, \ldots, \alpha_r$ form a basis for $E$ in the vector space
sense, elementary linear algebra shows that for any collection
$a_1, \ldots, a_r$ of real numbers, there exists a unique vector $H \in E$ with
$(\alpha_i, H) = a_i$, $i = 1, \ldots, r$. This shows that the fundamental Weyl
chamber (open or closed) is nonempty. It is easily seen that the open
fundamental Weyl chamber is an open set in $E$ and that its closure is the
closed fundamental Weyl chamber.


Definition 8.18. For each $\alpha \in R$, let $V_\alpha$ denote the hyperplane
perpendicular to $\alpha$. Then, an open Weyl chamber in $E$ (relative to $R$)
is a connected component of the set

$$E - \bigcup_{\alpha \in R} V_\alpha.$$

The following result shows that the use of the term “Weyl chamber” in
Definition 8.18 is consistent with its use in Definition 8.17. (The remaining
results of this section are offered without proof.)


Proposition 8.19. For each open Weyl chamber $C$, there exists a unique base
$\Delta_C$ for $R$ such that $C$ is the open fundamental Weyl chamber
associated to $\Delta_C$. The positive roots with respect to $\Delta_C$ are
precisely those elements $\alpha$ of $R$ such that $\alpha$ has positive inner
product with each element of $C$.

So, there is a one-to-one correspondence between Weyl chambers and
bases. Given a base $\Delta$, we construct a Weyl chamber by looking at vectors
having a positive inner product with each element of $\Delta$. Given a Weyl
chamber $C$, we define $R^+$ to be the set of roots $\alpha$ that have positive
inner product with each element of $C$ and we then take $\Delta$ to be the set
of indecomposable elements in $R^+$.


Theorem 8.20. The Weyl group acts simply and transitively on the set of 
bases and also on the set of open Weyl chambers. 


This means, in particular, that the base is unique up to the action of the 
Weyl group. 


8.4 Integral and Dominant Integral Elements


Definition 8.21. An element $\mu$ of $E$ is called an integral element if for
all $\alpha$ in $R$, the quantity

$$2\frac{(\mu, \alpha)}{(\alpha, \alpha)}$$

is an integer.


Definition 8.22. If $\Delta$ is a base for $R$, then an integral element $\mu$
is called dominant integral if

$$2\frac{(\mu, \alpha)}{(\alpha, \alpha)} \ge 0$$

for all $\alpha \in \Delta$. A dominant integral element is called strictly
dominant if

$$2\frac{(\mu, \alpha)}{(\alpha, \alpha)} > 0$$

for all $\alpha \in \Delta$.


This means that an integral element is dominant if and only if it is contained
in the closed fundamental Weyl chamber and strictly dominant if and only if it
is contained in the open fundamental Weyl chamber.


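Proposition 8.23. If $\mu \in E$ has the property that

$$2\frac{(\mu, \alpha)}{(\alpha, \alpha)}$$

is an integer for all $\alpha \in \Delta$, then the same holds for all
$\alpha \in R$ and, thus, $\mu$ is an integral element.


Proof. This follows from Proposition 8.16. Indeed, the hypothesis says that
$(\mu, H_\alpha)$ is an integer for each $\alpha \in \Delta$, and since the
co-roots $H_\alpha$, $\alpha \in \Delta$, form a base for $R^\vee$, every
co-root is a linear combination of them with integer coefficients. □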




Suppose $\mu$ is an element of $E$ for which $2(\mu, \alpha)/(\alpha, \alpha)$
is a non-negative integer for all $\alpha$ in $\Delta$. Then, the proposition
tells us that $\mu$ is an integral element and it then follows that $\mu$ is a
dominant integral element.


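Definition 8.24. Let $\Delta = \{\alpha_1, \ldots, \alpha_r\}$ be a base. Then,
the fundamental weights (relative to $\Delta$) are the elements
$\mu_1, \ldots, \mu_r$ with the property that

$$2\frac{(\mu_k, \alpha_l)}{(\alpha_l, \alpha_l)} = \delta_{kl}, \quad k, l = 1, \ldots, r. \tag{8.7}$$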


Let us see that there really are elements with this property. Fix $k$ with
$1 \le k \le r$ and let $V_k$ be the span of
$\alpha_1, \ldots, \alpha_{k-1}, \alpha_{k+1}, \ldots, \alpha_r$. Then, $V_k$
is an $(r-1)$-dimensional subspace of $E$ and the orthogonal complement
$V_k^\perp$ of $V_k$ is one-dimensional. If $\mu$ is a nonzero element of
$V_k^\perp$, then $\mu$ cannot be orthogonal to $\alpha_k$, or else $\mu$ would
be orthogonal to all of the $\alpha$'s and, hence, to every element of $E$,
including $\mu$ itself, which is impossible for a nonzero vector. Now, set

$$\mu_k = \frac{1}{2}\frac{(\alpha_k, \alpha_k)}{(\mu, \alpha_k)}\mu$$

and $\mu_k$ will be the $k$th fundamental weight. (The same sort of argument
shows that the fundamental weights are unique.)

Geometrically, the $k$th fundamental weight is the unique element of $E$ that
is perpendicular to each $\alpha_l$, $l \ne k$, and whose orthogonal projection
onto $\alpha_k$ is one-half of $\alpha_k$. The reader should look back at the
picture of the dominant integral elements for $\mathrm{sl}(3;\mathbb{C})$ in
Figure 5.2, identify the two fundamental weights, and verify that the
fundamental weights have this property.

Once we have found the fundamental weights, then the set of dominant
integral elements is precisely the set of linear combinations of the fundamental
weights with non-negative integer coefficients, and the set of all integral
elements is the set of linear combinations of fundamental weights with arbitrary
integer coefficients.

Note that every root is an integral element and, therefore, any linear
combination of roots with integer coefficients is also an integral element.
However, it is not necessarily the case that every integral element is an
integer linear combination of roots (see Section 8.10).
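Condition (8.7) says that $(\mu_k, H_{\alpha_l}) = \delta_{kl}$, so the
fundamental weights can be found by solving a linear system whose rows are the
co-roots of the base. The following Python sketch does this exactly for $B_2$.
The coordinates and the base $\alpha_1 = (1, -1)$, $\alpha_2 = (0, 1)$ are
assumptions made for this illustration, as are the helper names `solve`,
`fundamental_weights`, and `projection`.

```python
from fractions import Fraction

def inner(u, v):
    """Standard inner product on R^n."""
    return sum(a * b for a, b in zip(u, v))

def solve(A, b):
    """Solve A x = b exactly by Gauss-Jordan elimination (A square, invertible)."""
    n = len(A)
    M = [[Fraction(x) for x in row] + [Fraction(y)] for row, y in zip(A, b)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if M[r][col] != 0)
        M[col], M[pivot] = M[pivot], M[col]
        p = M[col][col]
        M[col] = [x / p for x in M[col]]
        for r in range(n):
            if r != col:
                f = M[r][col]
                M[r] = [x - f * y for x, y in zip(M[r], M[col])]
    return tuple(row[n] for row in M)

def fundamental_weights(base):
    """Return mu_1, ..., mu_r with 2 (mu_k, a_l) / (a_l, a_l) = delta_kl."""
    coroots = [[Fraction(2) * x / inner(a, a) for x in a] for a in base]
    r = len(base)
    return [solve(coroots, [int(l == k) for l in range(r)]) for k in range(r)]

def projection(mu, alpha):
    """Orthogonal projection of mu onto the line spanned by alpha."""
    c = Fraction(inner(mu, alpha)) / inner(alpha, alpha)
    return tuple(c * x for x in alpha)

base = [(1, -1), (0, 1)]            # assumed base for B2
mu1, mu2 = fundamental_weights(base)
```

In this model, `mu1` is $(1, 0)$ and `mu2` is $(1/2, 1/2)$, and the projection
of each $\mu_k$ onto $\alpha_k$ is one-half of $\alpha_k$, in agreement with the
geometric description above.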


Definition 8.25. If $\Delta = \{\alpha_1, \ldots, \alpha_r\}$ is a base, then
an element $\mu_0$ is said to be higher than $\mu$ if $\mu_0 - \mu$ can be
expressed as

$$\mu_0 - \mu = a_1\alpha_1 + \cdots + a_r\alpha_r,$$

with each $a_i \ge 0$. We equivalently say that $\mu$ is lower than $\mu_0$ and
we write this relation as $\mu_0 \succeq \mu$ or $\mu \preceq \mu_0$.


It is understood that the notion of higher and lower is relative to the
chosen base $\Delta$.


8.5 Examples in Rank Two


8.5.1 The root systems


Figure 8.2 shows four examples of rank-two root systems. We now prove that
every rank-two root system is equivalent to one of these four. So, let
$R \subset \mathbb{R}^2$ be a root system. Let $\theta$ be the smallest angle
occurring between any two vectors in $R$. Since the elements of $R$ span
$\mathbb{R}^2$, we can find two linearly independent vectors $\alpha$ and
$\beta$ in $R$. If the angle between $\alpha$ and $\beta$ is greater than 90°,
then the angle between $\alpha$ and $-\beta$ is less than 90°; thus, the
minimum angle $\theta$ is at most 90°. Then, according to Proposition 8.6,
$\theta$ must be one of the following: 90°, 60°, 45°, 30°.

Let $\alpha$ and $\beta$ be two elements of $R$ such that the angle between
them is the minimum angle $\theta$. I claim that for each integer $n$ there
must be a unique element of $R$ whose angle with $\alpha$ is $n\theta$. To see
this, apply to $\alpha$ the reflection $w_\beta$ about the line perpendicular
to $\beta$. Then, the vector $-w_\beta(\alpha)$ will be a vector that is at
angle $\theta$ to $\beta$ but on the opposite side of $\beta$ from $\alpha$, as
shown in Figure 8.4. Thus, $-w_\beta(\alpha)$ is at angle $2\theta$ to
$\alpha$. If we then apply to $\beta$ the reflection associated to the vector
$w_\beta(\alpha)$, the negative of this vector will be a vector at angle
$3\theta$ to $\alpha$. Continuing in the same way, we can obtain vectors at
angle $n\theta$ to $\alpha$ for all $n$. These vectors are unique since a
nontrivial positive multiple of a root is not allowed to be a root.

Note that all of the possible values of $\theta$ evenly divide 360°, so,
eventually, we will come around to $\alpha$ again. Note also that the vector at
angle $2\theta$ to $\alpha$ must have the same length as $\alpha$, since it is
obtained from $\alpha$ by applying the (length-preserving) reflection
$w_\beta$. The same sort of reasoning shows that any two vectors in $R$ whose
angle is an even multiple of $\theta$ must have the same length. For pairs of
vectors whose angle is an odd multiple of $\theta$, the ratio of the lengths
must be consistent with Proposition 8.6.


Fig. 8.4. $\alpha$ and $-w_\beta(\alpha)$


We now consider, in turn, each of the possible values for the minimum
angle $\theta$.

Case 1: $\theta = 90°$. In this case, the ratio of the lengths of perpendicular
vectors is undetermined, so we have a pair of vectors $\alpha$ and $-\alpha$
and a perpendicular pair of vectors $\beta$ and $-\beta$, with no restrictions
on the lengths of $\alpha$ and $\beta$.

Case 2: $\theta = 60°$. In this case, Proposition 8.6 tells us that all vectors
in $R$ must have the same length, since the angle between nonparallel elements
must be either 60° or 120°. Thus, we have six vectors of the same length, each
one at an angle of 60° to the adjacent ones.

Case 3: $\theta = 45°$. In this case, Proposition 8.6 tells us that the vectors
at 45° to each other must have a length ratio of $\sqrt{2}$ (or $1/\sqrt{2}$).
Meanwhile, vectors at an angle of 90° to each other must have the same length
since one can be obtained from the other by the reflection associated to the
vector at 45° to both of them. So, we have eight vectors at 45° angles, with
lengths alternating between a shorter length $L$ and a longer length
$\sqrt{2}L$.

Case 4: $\theta = 30°$. This case is similar to the previous one except that
now the ratio of lengths of consecutive vectors must be $\sqrt{3}$ rather than
$\sqrt{2}$. So, we have 12 vectors at 30° angles, with lengths alternating
between some shorter length $L$ and a longer length $\sqrt{3}L$.

These four cases correspond to the root systems $A_1 \times A_1$, $A_2$, $B_2$,
and $G_2$ (in that order), as pictured in Figure 8.2. It is left as an exercise
to verify that each of these collections of vectors is actually a root system.
This is true essentially because for any pair of vectors in each system, the
angles and length ratios are consistent with Proposition 8.6.
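The exercise of verifying the four cases can also be carried out numerically.
The following Python sketch builds each collection of vectors from the minimum
angle and checks two of the defining properties of a root system: that
$2(\alpha, \beta)/(\alpha, \alpha)$ is an integer and that each reflection
$w_\alpha$ maps the collection into itself. It does not check the remaining
properties, which hold by construction here. The function names, the choice
$L = 1$, the choice of equal lengths in Case 1, and the floating-point
tolerance are assumptions made for this illustration.

```python
import math

def rank_two_system(theta_deg):
    """Vectors at angles k*theta, with lengths alternating between 1 and a ratio."""
    ratio = {90: 1.0, 60: 1.0, 45: math.sqrt(2), 30: math.sqrt(3)}[theta_deg]
    n = 360 // theta_deg
    roots = []
    for k in range(n):
        length = 1.0 if k % 2 == 0 else ratio
        angle = math.radians(k * theta_deg)
        roots.append((length * math.cos(angle), length * math.sin(angle)))
    return roots

def is_root_system(roots, tol=1e-9):
    """Check integrality of 2(a,b)/(a,a) and invariance under each reflection."""
    def inner(u, v):
        return u[0] * v[0] + u[1] * v[1]

    def close(u, v):
        return abs(u[0] - v[0]) < tol and abs(u[1] - v[1]) < tol

    for a in roots:
        for b in roots:
            c = 2 * inner(a, b) / inner(a, a)
            if abs(c - round(c)) > tol:
                return False
            image = (b[0] - c * a[0], b[1] - c * a[1])   # w_a applied to b
            if not any(close(image, r) for r in roots):
                return False
    return True

sizes = [len(rank_two_system(t)) for t in (90, 60, 45, 30)]
checks = [is_root_system(rank_two_system(t)) for t in (90, 60, 45, 30)]
```

As a contrast, eight vectors of equal length at 45° angles fail the
integrality check, which is why the lengths in Case 3 must alternate.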

We also need to show that any two root systems falling under a particular
case are equivalent. This is perhaps least obvious for Case 1. If we have one
root system $R_1 = \{\pm\alpha_1, \pm\beta_1\}$ (with $\alpha_1$ perpendicular
to $\beta_1$) and another $R_2 = \{\pm\alpha_2, \pm\beta_2\}$ (with $\alpha_2$
perpendicular to $\beta_2$), then there is a unique linear map $A$ from
$\mathbb{R}^2$ to $\mathbb{R}^2$ that takes $\alpha_1$ to $\alpha_2$ and
$\beta_1$ to $\beta_2$. This map will be the desired equivalence, as is easily
verified. Note that $A$ need not be an isometry. The verification for the three
remaining cases is left as an exercise; in those cases the equivalence will be
a combination of a rotation and a dilation.


8.5.2 Connection with Lie algebras 


The root systems $A_1 \times A_1$, $A_2$, and $B_2$ arise as root systems of
“classical” Lie algebras as follows. The root system $A_1 \times A_1$ is the
root system of $\mathrm{so}(4;\mathbb{C})$, which is isomorphic to
$\mathrm{sl}(2;\mathbb{C}) \oplus \mathrm{sl}(2;\mathbb{C})$; $A_2$ is the root
system of $\mathrm{sl}(3;\mathbb{C})$; and $B_2$ is the root system of
$\mathrm{so}(5;\mathbb{C})$, which is isomorphic to $\mathrm{sp}(2;\mathbb{C})$.
The root system $G_2$ is the root system of an “exceptional” Lie algebra, also
called $G_2$. The Lie algebra $G_2$ can be represented as a Lie algebra of
matrices (indeed every Lie algebra can!), but not in any particularly simple
way.


8.5.3 The Weyl groups 


We now compute the Weyl group in each case. Suppose our root system has $n$
elements, in which case $\theta = 360°/n$. There will then be $n/2$
reflections, one for each pair $\pm\alpha$ of roots. If $\alpha$ and $\beta$
are roots with angle $\phi$ between them, then the composition of the
reflections $w_\alpha$ and $w_\beta$ will be a rotation by angle $\pm 2\phi$,
with the direction of the rotation depending on the order of the composition.
To see that this is the case, note that $w_\alpha$ and $w_\beta$ both have
determinant $-1$ and so $w_\alpha w_\beta$ has determinant 1 and is, therefore,
a rotation by some angle. To determine the angle, it suffices to apply
$w_\alpha w_\beta$ to any nonzero vector, for example, $\beta$. However,
$w_\beta(\beta) = -\beta$ and $w_\alpha(-\beta)$ is the vector at angle $\phi$
to $\alpha$ but on the opposite side of $\alpha$ as $\beta$, hence at angle
$2\phi$ to $\beta$. Then, the composition of a reflection $w_\alpha$ and a
rotation by angle $2\phi$ will be another reflection $w_\beta$, where $\beta$
is a root at angle $\phi$ to $\alpha$. (I leave it to the reader to verify
this.) Thus, the set of $n/2$ reflections together with rotations by integer
multiples of $2\theta$ form a group; this is the Weyl group of the rank-two
root system.

Therefore, if there are $n$ elements in the rank-two root system, then the
Weyl group consists of the $n/2$ reflections together with $n/2$ rotations,
namely rotations by even integer multiples of $\theta = 360°/n$. The Weyl group
is, therefore, the dihedral group of order $n$ (i.e., the symmetry group of a
regular $(n/2)$-gon). Note that in the case of $A_2$ the Weyl group consists of
three reflections together with three rotations (by multiples of 120°). In this
case, the Weyl group is not the full symmetry group of the root system:
Rotations by 60° map $R$ onto itself but are not elements of the Weyl group.
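The count of $n$ elements can be confirmed by generating the Weyl group
directly from the reflections. The following Python sketch does this for the
four rank-two systems by repeatedly multiplying $2 \times 2$ matrices until no
new elements appear. The construction of the roots, the function names, and
the rounding used to compare floating-point matrices are assumptions made for
this illustration.

```python
import math

def rank_two_roots(theta_deg):
    """Roots at angles k*theta with lengths alternating between 1 and a ratio."""
    ratio = {90: 1.0, 60: 1.0, 45: math.sqrt(2), 30: math.sqrt(3)}[theta_deg]
    roots = []
    for k in range(360 // theta_deg):
        length = 1.0 if k % 2 == 0 else ratio
        angle = math.radians(k * theta_deg)
        roots.append((length * math.cos(angle), length * math.sin(angle)))
    return roots

def reflection(a):
    """Matrix of w_a: v -> v - 2 (v, a)/(a, a) a, stored row by row."""
    x, y = a
    n = x * x + y * y
    return (1 - 2 * x * x / n, -2 * x * y / n,
            -2 * x * y / n, 1 - 2 * y * y / n)

def multiply(A, B):
    """Product of two 2x2 matrices stored as 4-tuples."""
    return (A[0] * B[0] + A[1] * B[2], A[0] * B[1] + A[1] * B[3],
            A[2] * B[0] + A[3] * B[2], A[2] * B[1] + A[3] * B[3])

def key(A):
    """Rounded copy of a matrix, used to detect repeats despite rounding error."""
    return tuple(round(x, 6) for x in A)

def weyl_group_order(roots):
    """Number of distinct products of the reflections w_a, a in roots."""
    generators = [reflection(a) for a in roots]
    identity = (1.0, 0.0, 0.0, 1.0)
    seen = {key(identity)}
    frontier = [identity]
    while frontier:
        A = frontier.pop()
        for g in generators:
            B = multiply(g, A)
            if key(B) not in seen:
                seen.add(key(B))
                frontier.append(B)
    return len(seen)

orders = [weyl_group_order(rank_two_roots(t)) for t in (90, 60, 45, 30)]
```

The resulting orders for $A_1 \times A_1$, $A_2$, $B_2$, and $G_2$ agree with
the number of roots in each system, as the argument above predicts.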


8.5.4 Duality 


Recall that the map $\alpha \mapsto 2\alpha/(\alpha, \alpha)$ turns each root
system into a new root system that is called the dual of the original root
system. This new root system in general may not be equivalent to the original
root system. However, in the rank-two case, one can see that the dual root
system is always equivalent to the original system. Recall that duality makes
long roots short and short roots long. So, in the case of $B_2$, the dual root
system can be converted back to the original one by rotating by 45° and then
rescaling all roots by a constant, and similarly for $G_2$.


8.5.5 Positive roots and dominant integral elements 


For each of the four rank-two root systems, we select a base
$\{\alpha_1, \alpha_2\}$ and then indicate the associated fundamental weights
$\mu_1$ and $\mu_2$. Here, $\mu_1$ is orthogonal to $\alpha_2$ and its
orthogonal projection onto $\alpha_1$ is one-half of $\alpha_1$, whereas
$\mu_2$ is orthogonal to $\alpha_1$ and its orthogonal projection onto
$\alpha_2$ is one-half of $\alpha_2$. The case of $A_2$ was shown in Figure
5.2. We now consider the remaining three cases. In each of Figures 8.5, 8.6,
and 8.7, the fundamental weights for the relevant root system are circled and
the other dominant integral elements are indicated by black dots. The dashed
lines indicate the boundary of the fundamental Weyl chamber. The grid in the
background indicates the set of all integral elements. Note that $B_2$ is
presented here rotated by 45° from its orientation in Figure 8.2. This rotation
allows the set of integral elements to be a square lattice with edges oriented
horizontally and vertically. Note also that $A_1 \times A_1$ is presented here
with all roots having the same length, even though they are not required to do
so. The reader should verify that, in each case, the set
$\{\alpha_1, \alpha_2\}$ is actually a base. For example, in the case of $G_2$,
the positive roots are $\alpha_1$, $\alpha_2$, $\alpha_2 + \alpha_1$,
$\alpha_2 + 2\alpha_1$, $\alpha_2 + 3\alpha_1$, and $2\alpha_2 + 3\alpha_1$,
and the remaining roots are the negatives of these six.

Fig. 8.6. Roots and dominant integral elements for $B_2$

Fig. 8.7. Roots and dominant integral elements for $G_2$


8.5.6 Weight diagrams


Since we have already seen (in Chapter 5) weight diagrams for the $A_2$ case
(i.e., the Lie algebra $\mathrm{sl}(3;\mathbb{C})$), we will content ourselves
here with one representative diagram for each of the remaining three rank-two
cases, $A_1 \times A_1$, $B_2$, and $G_2$ (corresponding to the Lie algebras
$\mathrm{so}(4;\mathbb{C}) \cong \mathrm{sl}(2;\mathbb{C}) \oplus \mathrm{sl}(2;\mathbb{C})$,
$\mathrm{so}(5;\mathbb{C}) \cong \mathrm{sp}(2;\mathbb{C})$, and $G_2$).

We make use of Theorem 7.41, which asserts that if $\pi$ is a representation
with highest weight $\mu_0$, then $\mu$ is a weight for $\pi$ if and only if
(1) $\mu$ is contained in the convex hull of the orbit of $\mu_0$ under the
Weyl group and (2) $\mu_0 - \mu$ can be expressed as a linear combination of
roots with integer coefficients. Note that in most cases, there will be
integral elements contained in the convex hull of the orbit of $\mu_0$ that are
not weights of $\pi$ because they do not satisfy the second property in Theorem
7.41. In the case of $A_1 \times A_1$, the Weyl group is generated by
reflections about the $x$-axis and the $y$-axis, so the orbit of a typical
point is a rectangle. For $B_2$, every element of the Weyl group is either a
rotation by a multiple of 90° or a combination of the reflection about
$\alpha_1$ and such a rotation. So, to obtain the orbit of $\mu_0$, we first
look at the two points $\mu_0$ and $w_{\alpha_1} \cdot \mu_0$ and then at the
eight points obtained by rotating $\mu_0$ and $w_{\alpha_1} \cdot \mu_0$ by
multiples of 90°. (If $\mu_0$ is on the boundary of the fundamental Weyl
chamber, then these eight points will not be distinct.) For $G_2$, the orbit of
$\mu_0$ is obtained by looking at rotations by multiples of 60° applied to the
two points $\mu_0$ and $w_{\alpha_1} \cdot \mu_0$, yielding 12 (generically
distinct) points.
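The orbit of a highest weight can also be computed mechanically by applying
reflections until no new points appear. The following Python sketch does this
for $B_2$, using exact rational arithmetic. The coordinates for $B_2$ (with
positive roots $e_1$, $e_2$, $e_1 + e_2$, $e_1 - e_2$), the sample points, and
the function names are assumptions made for this illustration; in particular,
this model of $B_2$ is not rotated as in the figures.

```python
from fractions import Fraction

def inner(u, v):
    """Standard inner product on R^n."""
    return sum(a * b for a, b in zip(u, v))

def reflect(v, alpha):
    """w_alpha . v = v - 2 (v, alpha)/(alpha, alpha) alpha."""
    c = Fraction(2) * inner(v, alpha) / inner(alpha, alpha)
    return tuple(x - c * a for x, a in zip(v, alpha))

def weyl_orbit(mu, roots):
    """All images of mu under the group generated by the reflections w_alpha."""
    start = tuple(Fraction(x) for x in mu)
    orbit = {start}
    frontier = [start]
    while frontier:
        point = frontier.pop()
        for alpha in roots:
            image = reflect(point, alpha)
            if image not in orbit:
                orbit.add(image)
                frontier.append(image)
    return orbit

# Positive roots suffice, since w_alpha and w_(-alpha) are the same reflection.
B2_positive = [(1, 0), (0, 1), (1, 1), (1, -1)]

generic = weyl_orbit((3, 1), B2_positive)   # interior of a Weyl chamber
on_wall = weyl_orbit((2, 2), B2_positive)   # perpendicular to the root (1, -1)
```

The generic point has an orbit of eight points, whereas the point on the
boundary of a chamber has an orbit of only four, illustrating the parenthetical
remark above.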

In each figure, a black dot indicates a weight of the given representation
and the highest weight is circled. A number next to a dot indicates the
multiplicity of the corresponding weight. A dot without a number indicates a
weight of multiplicity one. One set of dashed lines indicates the boundary of
the fundamental Weyl chamber and another set of dashed lines indicates the
boundary of the set of points lower than the highest weight.

Fig. 8.10. Typical weight diagram for $G_2$


8.6 Examples in Rank Three


In rank three, we can have a reducible root system, which must be a direct
sum of $A_1$ with one of the rank-two root systems described in the previous
section. In this section, we will consider only the irreducible root systems of
rank three. There are, up to equivalence, three irreducible root systems in
rank three, customarily denoted $A_3$, $B_3$, and $C_3$. They all arise as root
systems of classical Lie algebras. The root system $A_3$ comes from the Lie
algebra $\mathrm{sl}(4;\mathbb{C})$ (or $\mathrm{so}(6;\mathbb{C})$, which is
isomorphic to $\mathrm{sl}(4;\mathbb{C})$). The root system $B_3$ comes from
the Lie algebra $\mathrm{so}(7;\mathbb{C})$ and the root system $C_3$ comes
from the Lie algebra $\mathrm{sp}(3;\mathbb{C})$. The connection with Lie
algebras is described in greater detail later in this section and in Section
8.8.

The color plates (p. 162) show models related to rank-three root systems,
built using the Zome system, available at www.zometool.com. Many more
images can be seen on the author’s web site, at www.nd.edu/~bhall. The
images in the color plates were modeled using the vZome program, available
free from the programmer, Scott Vorthmann, at www.vorthmann.org/zome.
The models were then rendered in the POV program by Charles Albrecht.
The reader is encouraged to obtain enough Zome to make the root systems
for him- or herself. The root systems require the “green lines” which are not
part of the simplest Zome kits. The model of the dominant integral elements
for $C_3$ (Plate 6) also makes use of “half-length blue lines” which are
available by special order. (However, one can easily “fake it” using only whole
blue lines.) In each plate, the roots marked with red balls form a base for the
relevant root system.

The first three color plates show the root systems $A_3$, $B_3$, and $C_3$.
Let us make a few observations about these root systems. First, $B_3$ and $C_3$
are obtained by adding more vectors to $A_3$. In higher dimensions, $B_n$ and
$C_n$ are obtained by adding vectors to $D_n$, not to $A_n$. However, we have a
low-dimensional coincidence in rank three, namely that $A_3$ is the same as
$D_3$, reflecting that $\mathrm{sl}(4;\mathbb{C})$ is isomorphic to
$\mathrm{so}(6;\mathbb{C})$.

Second, $B_3$ and $C_3$ are dual to each other. Specifically, let us normalize
the inner product so that the green lines have length $\sqrt{2}$. In that case,
the blue lines in $B_3$ (Plate 2) have length 1. Thus when we replace each root
$\alpha$ by $2\alpha/(\alpha, \alpha)$, the green lines are unchanged and the
blue lines are replaced by blue lines of twice the length. This gives $C_3$
(Plate 3).

Third, in both $B_3$ and $C_3$, the long roots by themselves form a root
system, as do the short roots by themselves. For $B_3$, the long roots form
$A_3$ and the short roots form $A_1 \times A_1 \times A_1$; for $C_3$, it is
the reverse.

I will not make any attempt in this section to prove that these are the only 
irreducible root systems in rank three. See the next section for a discussion of 
the classification of root systems (and semisimple Lie algebras). 

The calculations in Section 8.8 will show that each of these rank-three root
systems arises from one of the Lie algebras
$\mathrm{sl}(4;\mathbb{C}) \cong \mathrm{so}(6;\mathbb{C})$,
$\mathrm{so}(7;\mathbb{C})$, and $\mathrm{sp}(3;\mathbb{C})$, as follows.

Case 1: $\mathrm{sl}(4;\mathbb{C})$ or $\mathrm{so}(6;\mathbb{C})$. In these two
cases (which are isomorphic), one obtains 12 roots, all of the same length and
with all angles between nonparallel roots equal to 60°, 90°, or 120°. This is
the $A_3$ root system.

Case 2: $\mathrm{so}(7;\mathbb{C})$. In this case, one gets 12 “long” roots
that form, among themselves, a root system isomorphic to $A_3$. One gets, in
addition, six “short” roots that are shorter by a factor of $\sqrt{2}$ than the
long roots. The short roots come in pairs of the form $\pm\alpha$, with each
pair perpendicular to the other two pairs of short roots, so that the short
roots among themselves form a root system isomorphic to
$A_1 \times A_1 \times A_1$. The total root system (including both long and
short roots) is the $B_3$ root system.

Case 3: $\mathrm{sp}(3;\mathbb{C})$. This case is the same as Case 2 except
that the 12 roots forming $A_3$ are short and the 6 roots forming
$A_1 \times A_1 \times A_1$ are longer by a factor of $\sqrt{2}$ than the 12
short roots. This is the $C_3$ root system, which is the dual root system to
$B_3$.

We now look at the set of dominant integral elements for each of the three
irreducible root systems in rank three. In each case, we identify three positive
simple roots (indicated by a red ball), denoted $\alpha_1$, $\alpha_2$, and $\alpha_3$, and we then
consider the associated fundamental weights, denoted $\mu_1$, $\mu_2$, and $\mu_3$. This
means that $\mu_1$ is orthogonal to $\alpha_2$ and $\alpha_3$ and that the orthogonal projection
of $\mu_1$ onto $\alpha_1$ is equal to $\tfrac{1}{2}\alpha_1$, and similarly for $\mu_2$ and $\mu_3$. In each case, the
plate shows elements of the form $n_1\mu_1 + n_2\mu_2 + n_3\mu_3$, with each $n_k$ ranging over
the set {0, 1, 2}. In the case of A3 (Plate 4) we obtain two yellow fundamental
weights and one blue one. In the case of B3 and C3 (Plates 5 and 6), we
obtain one green fundamental weight, one blue one, and one yellow one. The
directions for both roots and weights are the same in B3 as they are for C3;
only the lengths have changed. (Compare Proposition 6.37.)

Plate 7 shows a representative weight diagram for the Lie algebra sl(4; C), 
whose root system is A3. The highest weight of this representation is the sum 
of the three fundamental weights. (The image includes one “cell” of the set 
of dominant integral elements for reference.) The weights of this representa- 
tion split up into four orbits under the Weyl group. The 24 vertices of the 
outer polyhedron make up a single orbit, and each weight in this orbit has 
multiplicity 1. The centers of the 8 hexagons break up into two orbits of the 
Weyl group, with four elements each. Each of these weights has multiplicity 
2. Finally, the 6 vertices of the inner octahedron make up a single orbit, and 
the weights in this orbit have multiplicity 4. Counting the weights with their 
multiplicities shows that the dimension of the representation is 64. 
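
The count of 64 can be cross-checked in two ways. The sketch below first adds up the orbit sizes and multiplicities quoted for Plate 7, and then evaluates the Weyl dimension formula, dim = ∏ ⟨α, λ + δ⟩/⟨α, δ⟩ over the positive roots α, for the highest weight λ equal to the sum of the fundamental weights (which coincides with δ, half the sum of the positive roots). It assumes the usual model of A3 with positive roots e_i − e_j (i < j) in R^4; the function names are illustrative only.

```python
from fractions import Fraction
from itertools import combinations

def positive_roots_A(n):
    """Positive roots e_i - e_j (i < j) of A_n, as vectors in R^(n+1)."""
    roots = []
    for i, j in combinations(range(n + 1), 2):
        v = [0] * (n + 1)
        v[i], v[j] = 1, -1
        roots.append(tuple(v))
    return roots

def dot(u, v):
    return sum(Fraction(a) * Fraction(b) for a, b in zip(u, v))

def half_sum(pos_roots):
    """delta = half the sum of the positive roots."""
    return tuple(Fraction(sum(col), 2) for col in zip(*pos_roots))

def weyl_dimension(pos_roots, highest_weight):
    """Product over positive roots of <a, lambda + delta> / <a, delta>."""
    delta = half_sum(pos_roots)
    shifted = tuple(h + d for h, d in zip(highest_weight, delta))
    dim = Fraction(1)
    for a in pos_roots:
        dim *= dot(a, shifted) / dot(a, delta)
    return dim

pos = positive_roots_A(3)
delta = half_sum(pos)                        # equals the sum of the fundamental weights
dim_from_formula = weyl_dimension(pos, delta)
dim_from_plate = 24 * 1 + 8 * 2 + 6 * 4      # orbit sizes times multiplicities
```

Each of the six factors in the product equals 2, so both computations agree.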


8.7 Additional Properties 


There are many other properties of roots and the Weyl group that are known
and worth knowing. We have focused, up to now, on the most essential properties
and on examples of root systems. We now list some of the remaining
properties without proof. Proofs may be found in any number of standard references,
including Chapter III of Humphreys (1972) and Chapter V of Bröcker
and tom Dieck (1985). (See also Chapter V of Serre (1987).) In all of these
properties, R is a root system in the Euclidean space E, $\Delta$ is a fixed base of
R, C is the open fundamental Weyl chamber, and $\bar{C}$ is the closed fundamental
Weyl chamber.


1. Every root is an element of some base. 

2. If R is irreducible, then the Weyl group acts irreducibly on E. 

3. If R is irreducible, then at most two different lengths of roots can arise in
R and the Weyl group acts transitively on each length of root.

4. The reflections $w_\alpha$, $\alpha \in \Delta$, generate the Weyl group for R.

5. If $\alpha$ is an element of $\Delta$ and $\beta$ is a positive root different from $\alpha$, then
$w_\alpha \cdot \beta$ is a positive root; that is, $w_\alpha$ permutes the positive roots different
from $\alpha$.

6. Each orbit of the Weyl group contains exactly one point in $\bar{C}$.

7. If $\mu \in E$ is not contained in any of the hyperplanes perpendicular to the
roots, then the Weyl group acts freely on $\mu$ (that is, $w \cdot \mu = \mu$ only when w
is the identity).

8. If $\mu_0$ is an element of $\bar{C}$, then $\mu_0 \succeq w \cdot \mu_0$ for all $w \in W$.

9. If $\mu_0 \in \bar{C}$, then for all $\mu \in E$, $\mu$ is contained in the convex hull of the
W-orbit of $\mu_0$ if and only if $w \cdot \mu \preceq \mu_0$ for all $w \in W$.

10. Let $\delta$ denote half the sum of the positive roots. Then, for each positive
simple root $\alpha$,

\[ \frac{2\langle\alpha,\delta\rangle}{\langle\alpha,\alpha\rangle} = 1. \]

Thus, $\delta$ is a strictly dominant integral element and every strictly dominant
integral element $\mu$ can be expressed as $\mu = \lambda + \delta$, where $\lambda$ is a dominant
integral element.
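
Properties 5 and 10 are easy to test on a small example. The following sketch checks both for B3, assuming the coordinate model in which the positive roots are e_i ± e_j (i < j) and e_i, with base e_1 − e_2, e_2 − e_3, e_3; this model and the function names are assumptions made for illustration.

```python
from fractions import Fraction
from itertools import combinations

def dot(u, v):
    return sum(Fraction(a) * Fraction(b) for a, b in zip(u, v))

def reflect(alpha, beta):
    """w_alpha . beta = beta - (2<alpha,beta>/<alpha,alpha>) alpha."""
    c = 2 * dot(alpha, beta) / dot(alpha, alpha)
    return tuple(Fraction(b) - c * a for a, b in zip(alpha, beta))

def positive_roots_B(n):
    """Positive roots e_i + e_j, e_i - e_j (i < j) and e_i of B_n."""
    pos = []
    for i, j in combinations(range(n), 2):
        for s in (1, -1):
            v = [0] * n
            v[i], v[j] = 1, s
            pos.append(tuple(v))
    for i in range(n):
        v = [0] * n
        v[i] = 1
        pos.append(tuple(v))
    return pos

pos = positive_roots_B(3)
base = [(1, -1, 0), (0, 1, -1), (0, 0, 1)]
delta = tuple(Fraction(sum(col), 2) for col in zip(*pos))

# Property 10: 2<alpha, delta>/<alpha, alpha> for each simple root alpha.
prop10 = [2 * dot(a, delta) / dot(a, a) for a in base]

# Property 5: w_alpha permutes the positive roots other than alpha.
prop5 = all(
    {reflect(a, b) for b in pos if b != a} == {b for b in pos if b != a}
    for a in base
)
```

Here `delta` comes out as (5/2, 3/2, 1/2), and each entry of `prop10` equals 1.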


Let us consider examples of the ways in which these properties of root
systems can be used. In Section 7.3, we needed to show that the weights of a
certain quotient representation $V_{\mu_0}/U_{\mu_0}$ were invariant under the action of the
Weyl group. To do this, it is sufficient to show that the weights are invariant
under the action of some generating set for W, and we use Property 4, that
the $w_\alpha$'s with $\alpha$ in the base $\Delta$ are a generating set. It is not feasible to prove
directly that the weights are invariant under all the $w_\alpha$'s, $\alpha \in R$; we make use
of special properties of the $w_\alpha$'s, $\alpha \in \Delta$.

As another example, we use Property 9 to show that all the weights of
the irreducible finite-dimensional representation with highest weight $\mu_0$ are
contained in the convex hull of the W-orbit of $\mu_0$ (Theorem 7.41). We use
Properties 7 and 10 in the proof of the Weyl character formula to show that
the weights $w \cdot (\mu_0 + \delta)$, $w \in W$, are distinct. We also use the integrality of
$\delta$ (part of Property 10) to show that the numerator and denominator of the
Weyl character formula are well-defined functions on T (in the case that K is
simply connected).

See the exercises for additional examples of how the results in this section
can be used.


8.8 The Root Systems of the Classical Lie Algebras 


In this section, we consider the root systems of the “classical” complex 
semisimple Lie algebras, namely sl(n;C), so(n;C), and sp(n;C). We have already
worked out in detail the case of sl(n;C) in Section 6.9. The root system
for sl(n;C) is denoted $A_{n-1}$. The analysis of the orthogonal algebras so(n;C)
builds on the calculations in Exercises 12 and 13 of Chapter 6 and is divided 
into the cases n even and n odd. The analysis of the symplectic algebras 
sp(n; C) builds on Exercise 14 of Chapter 6. 


8.8.1 The orthogonal algebras so(2n; C) 


The root system for so(2n;C) is denoted Dn. We consider so(2n;C), the space
of 2n × 2n skew-symmetric complex matrices, with compact real form so(2n),
the space of 2n × 2n skew-symmetric real matrices. We consider in so(2n) the
maximal commutative subalgebra $\mathfrak{t}$ consisting of 2 × 2 block-diagonal matrices
in which the $k$th diagonal block is of the form

\[ \begin{pmatrix} 0 & a_k \\ -a_k & 0 \end{pmatrix} \tag{8.8} \]

for some $a_k \in \mathbb{R}$. We then consider the Cartan subalgebra $\mathfrak{h} = \mathfrak{t} + i\mathfrak{t}$ of
so(2n;C), which consists of 2 × 2 block-diagonal matrices in which the $k$th
diagonal block is of the form (8.8) with $a_k \in \mathbb{C}$. (The calculations in the next
two paragraphs show that so(2n;C) decomposes as a direct sum of $\mathfrak{h}$ and
root spaces $\mathfrak{g}_\alpha$ corresponding to (nonzero) elements $\alpha \in \mathfrak{h}^*$. It follows from
this that $\mathfrak{t}$ is actually a maximal commutative subalgebra of so(2n), which is
not obvious from the definition of $\mathfrak{t}$. Similar remarks apply in the next two
subsections.)

The root vectors are now 2 × 2 block matrices having a 2 × 2 matrix C in
the (k, l) block (k < l), the matrix $-C^{tr}$ in the (l, k) block, and zero in all
other blocks, where C is one of the four matrices

\[ \begin{pmatrix} 1 & i \\ i & -1 \end{pmatrix}, \quad \begin{pmatrix} 1 & -i \\ -i & -1 \end{pmatrix}, \quad \begin{pmatrix} 1 & -i \\ i & 1 \end{pmatrix}, \quad \begin{pmatrix} 1 & i \\ -i & 1 \end{pmatrix}. \]

A little calculation shows that these are, indeed, root vectors and that the
corresponding roots are the linear functionals on $\mathfrak{h}$ given by $i(a_k + a_l)$, $-i(a_k + a_l)$,
$i(a_k - a_l)$, and $-i(a_k - a_l)$, respectively. (Compare Exercise 12 in Chapter
6.)

We may consider the inner product $\langle X, Y\rangle = \operatorname{trace}(X^*Y)$ on so(2n;C),
which is invariant under the adjoint action of SO(2n). If we use this inner
product to identify $\mathfrak{h}^*$ with $\mathfrak{h}$, then the roots are thought of as elements of
$\mathfrak{h}$ instead of $\mathfrak{h}^*$. Let $\Theta_k$ denote the 2 × 2 block-diagonal matrix whose $k$th
diagonal block is

\[ \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix} \]

and whose other diagonal blocks are zero. The roots (as elements of $\mathfrak{h}$) are
then the matrices

\[ \frac{i}{2}(\pm\Theta_k \pm \Theta_l) \]

with $1 \le k < l \le n$. Each of the roots has length 1 with respect to the given
inner product. The inner product of $(i/2)(\pm\Theta_k \pm \Theta_l)$ with $(i/2)(\pm\Theta_{k'} \pm \Theta_{l'})$
is zero if the set {k, l} is disjoint from {k', l'}, and the inner product is ±1/2 if
the intersection of {k, l} and {k', l'} has one element. The root $(i/2)(\Theta_k - \Theta_l)$
is orthogonal to the root $(i/2)(\Theta_k + \Theta_l)$.

As a base, we may take the n − 1 roots

\[ \frac{i}{2}(\Theta_1 - \Theta_2),\ \frac{i}{2}(\Theta_2 - \Theta_3),\ \ldots,\ \frac{i}{2}(\Theta_{n-2} - \Theta_{n-1}),\ \frac{i}{2}(\Theta_{n-1} - \Theta_n) \tag{8.9} \]

together with the one additional root,

\[ \frac{i}{2}(\Theta_{n-1} + \Theta_n). \tag{8.10} \]

Note that for $1 \le k < l \le n$, we have the following formulas:

\begin{align*}
\Theta_k - \Theta_l &= (\Theta_k - \Theta_{k+1}) + (\Theta_{k+1} - \Theta_{k+2}) + \cdots + (\Theta_{l-1} - \Theta_l), \\
\Theta_k + \Theta_n &= (\Theta_k - \Theta_{n-1}) + (\Theta_{n-1} + \Theta_n), \\
\Theta_k + \Theta_l &= (\Theta_k + \Theta_n) + (\Theta_l - \Theta_n).
\end{align*}

This shows that every root of the form $(i/2)(\Theta_k - \Theta_l)$ or $(i/2)(\Theta_k + \Theta_l)$
(k < l) can be written as a linear combination of the base in (8.9) and (8.10)
with non-negative integer coefficients. The roots of this form are then positive
and the remaining roots are negative.
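
The three decomposition formulas can be turned into a small routine that returns the coefficients of a positive root in the base (8.9) and (8.10). In the sketch below, the coordinate vector e_k stands in for $(i/2)\Theta_k$ (the overall scale does not affect the coefficients); the function names and the 1-based indexing are choices made for illustration.

```python
def unit(k, n):
    """Coordinate vector e_k (1-based k), standing in for (i/2) Theta_k."""
    v = [0] * n
    v[k - 1] = 1
    return v

def base_D(n):
    """Base (8.9)-(8.10) in coordinates: e_m - e_{m+1} (m < n), then e_{n-1} + e_n."""
    base = []
    for m in range(1, n):
        v = unit(m, n)
        v[m] = -1
        base.append(v)
    last = unit(n - 1, n)
    last[n - 1] = 1
    base.append(last)
    return base

def coeffs_minus(k, l, n):
    """Coefficients of e_k - e_l (k <= l): a telescoping sum of e_m - e_{m+1}."""
    c = [0] * n
    for m in range(k, l):
        c[m - 1] += 1
    return c

def coeffs_plus(k, l, n):
    """Coefficients of e_k + e_l (k < l), following the second and third formulas."""
    c = coeffs_minus(k, n - 1, n)      # e_k - e_{n-1}
    c[n - 1] += 1                      # + (e_{n-1} + e_n)
    for idx, val in enumerate(coeffs_minus(l, n, n)):
        c[idx] += val                  # + (e_l - e_n)
    return c

def combine(coeffs, base):
    return [sum(c * b[i] for c, b in zip(coeffs, base)) for i in range(len(base))]

def check_D(n):
    """Verify both decompositions for every pair k < l."""
    base = base_D(n)
    for k in range(1, n + 1):
        for l in range(k + 1, n + 1):
            minus = [a - b for a, b in zip(unit(k, n), unit(l, n))]
            plus = [a + b for a, b in zip(unit(k, n), unit(l, n))]
            for target, c in ((minus, coeffs_minus(k, l, n)),
                              (plus, coeffs_plus(k, l, n))):
                if min(c) < 0 or combine(c, base) != target:
                    return False
    return True
```

For example, in D4 the root corresponding to e_1 + e_2 has coefficients [1, 2, 1, 1] in this base.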

Two consecutive roots in the list (8.9) have an angle of 120° and two 
nonconsecutive roots in the list (8.9) are orthogonal. The angle between the 
root in (8.10) and the second-to-last element in the list (8.9) is 120°; the root 
in (8.10) is orthogonal to all the other roots in (8.9). 
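
The root-vector computation of this subsection can be spot-checked for so(4;C) (n = 2, k = 1, l = 2). The sketch below builds an element H of the Cartan subalgebra with blocks of the form (8.8), places a 2 × 2 matrix C in the (1, 2) block and $-C^{tr}$ in the (2, 1) block, and compares the commutator [H, X] with the expected multiple of X. The four matrices C are written out explicitly as an assumption of this sketch (they are the ones satisfying the stated eigenvalue relations), and the numerical values of a_1 and a_2 are arbitrary.

```python
def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def bracket(A, B):
    """Commutator [A, B] = AB - BA."""
    AB, BA = matmul(A, B), matmul(B, A)
    n = len(A)
    return [[AB[i][j] - BA[i][j] for j in range(n)] for i in range(n)]

def cartan_element(a1, a2):
    """Block-diagonal matrix with 2x2 blocks [[0, a_k], [-a_k, 0]]."""
    return [[0, a1, 0, 0],
            [-a1, 0, 0, 0],
            [0, 0, 0, a2],
            [0, 0, -a2, 0]]

def root_vector(C):
    """C in the (1,2) block, -C^tr in the (2,1) block, zero elsewhere."""
    X = [[0] * 4 for _ in range(4)]
    for r in range(2):
        for c in range(2):
            X[r][2 + c] = C[r][c]
            X[2 + c][r] = -C[r][c]
    return X

a1, a2 = 2, 5
H = cartan_element(a1, a2)
cases = [
    ([[1, 1j], [1j, -1]], 1j * (a1 + a2)),
    ([[1, -1j], [-1j, -1]], -1j * (a1 + a2)),
    ([[1, -1j], [1j, 1]], 1j * (a1 - a2)),
    ([[1, 1j], [-1j, 1]], -1j * (a1 - a2)),
]
checks = []
for C, root in cases:
    X = root_vector(C)
    expected = [[root * x for x in row] for row in X]
    checks.append(bracket(H, X) == expected)
```

All arithmetic involves small Gaussian integers, so the floating-point comparisons are exact.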


8.8.2 The orthogonal algebras so(2n + 1; C) 


The root system for so(2n+1;C) is denoted Bn. We consider so(2n+1;C), the
space of (2n+1) × (2n+1) skew-symmetric complex matrices, with compact
real form so(2n+1), the space of (2n+1) × (2n+1) skew-symmetric real
matrices. We consider in so(2n+1) the maximal commutative subalgebra $\mathfrak{t}$
consisting of block-diagonal matrices with n blocks of size 2 × 2 followed by
one block of size 1 × 1. We take the 2 × 2 blocks to be of the same form
as in so(2n) and we take the 1 × 1 block to be zero. The associated Cartan
subalgebra $\mathfrak{h}$ of so(2n+1;C) then consists of matrices of the same form as in $\mathfrak{t}$,
except that the off-diagonal elements of the 2 × 2 blocks are permitted to be complex.

The Cartan subalgebra in so(2n+1;C) is identifiable in an obvious way
with the Cartan subalgebra in so(2n;C). In particular, both so(2n;C) and
so(2n+1;C) have rank n. With this identification of the Cartan subalgebras,
every root for so(2n;C) is also a root for so(2n+1;C). There are 2n additional
roots for so(2n+1;C). The root vectors for these additional roots are as
follows. First, the matrices having

\[ B_1 = \begin{pmatrix} 1 \\ i \end{pmatrix} \]

in entries (2k−1, 2n+1) and (2k, 2n+1) and having $-B_1^{tr}$ in entries (2n+1, 2k−1)
and (2n+1, 2k). Second, the matrices having

\[ B_2 = \begin{pmatrix} 1 \\ -i \end{pmatrix} \]

in entries (2k−1, 2n+1) and (2k, 2n+1) and $-B_2^{tr}$ in entries (2n+1, 2k−1)
and (2n+1, 2k). The corresponding roots, viewed as elements of $\mathfrak{h}^*$, are
given by $ia_k$ and $-ia_k$. (Compare Exercise 13 in Chapter 6.)

Let $\Theta_k$ have the same meaning as in the previous subsection, except that
now $\Theta_k$ is a (2n+1) × (2n+1) matrix. We use the inner product
$\langle X, Y\rangle = \operatorname{trace}(X^*Y)$, which is invariant under the adjoint action of SO(2n+1), to
identify $\mathfrak{h}^*$ with $\mathfrak{h}$. In that case, the additional roots for the so(2n+1;C) case
are given by

\[ \pm\frac{i}{2}\Theta_k. \]

These additional roots have length $1/\sqrt{2}$ with respect to the given inner product,
whereas the roots that are the same as for so(2n;C) have length 1.

As a base for our root system, we may take the n − 1 roots

\[ \frac{i}{2}(\Theta_1 - \Theta_2),\ \frac{i}{2}(\Theta_2 - \Theta_3),\ \ldots,\ \frac{i}{2}(\Theta_{n-2} - \Theta_{n-1}),\ \frac{i}{2}(\Theta_{n-1} - \Theta_n) \tag{8.11} \]

(exactly as in the so(2n;C) case) together with the one additional root,

\[ \frac{i}{2}\Theta_n. \tag{8.12} \]


The positive roots are those of the form $(i/2)(\Theta_k - \Theta_l)$ or $(i/2)(\Theta_k + \Theta_l)$
(k < l) and those of the form $(i/2)\Theta_k$ ($1 \le k \le n$). As in the so(2n;C) case,
consecutive roots in the list (8.11) have an angle of 120°, whereas nonconsecutive
roots on the list (8.11) are orthogonal. Meanwhile, the root in (8.12)
has an angle of 135° with the last root in (8.11) and is orthogonal to the
remaining roots in (8.11).


8.8.3 The symplectic algebras sp(n; C)


The root system for sp(n;C) is denoted Cn. We consider sp(n;C), the space
of 2n × 2n complex matrices X satisfying $JX^{tr}J = X$, where J is the 2n × 2n
matrix

\[ J = \begin{pmatrix} 0 & I \\ -I & 0 \end{pmatrix}. \]

Explicitly, the elements of sp(n;C) are matrices of the form

\[ \begin{pmatrix} A & B \\ C & -A^{tr} \end{pmatrix}, \]

where A is an arbitrary n × n matrix and B and C are arbitrary symmetric
matrices. We consider the compact real form sp(n) = sp(n;C) ∩ u(2n).


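We consider the maximal commutative subalgebra $\mathfrak{t}$ of sp(n) consisting of
matrices of the form

\[ \begin{pmatrix} ia_1 & & & & & \\ & \ddots & & & & \\ & & ia_n & & & \\ & & & -ia_1 & & \\ & & & & \ddots & \\ & & & & & -ia_n \end{pmatrix} \]

with $a_1, \ldots, a_n \in \mathbb{R}$. We then consider the Cartan subalgebra $\mathfrak{h} = \mathfrak{t} + i\mathfrak{t}$ of
sp(n;C), which consists of matrices of the same form but with $a_1, \ldots, a_n \in \mathbb{C}$.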

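Let $E_{kl}$ denote the n × n matrix whose (k, l) entry is one and whose other
entries are zero. Then, the 2n × 2n matrices of the block form

\[ \begin{pmatrix} 0 & E_{kl} + E_{lk} \\ 0 & 0 \end{pmatrix}, \quad \begin{pmatrix} 0 & 0 \\ E_{kl} + E_{lk} & 0 \end{pmatrix} \tag{8.13} \]

(k ≠ l) are root vectors for which the corresponding roots are $i(a_k + a_l)$ and
$-i(a_k + a_l)$. Next, matrices of the block form

\[ \begin{pmatrix} E_{kl} & 0 \\ 0 & -E_{lk} \end{pmatrix} \tag{8.14} \]

(k ≠ l) are root vectors for which the corresponding roots are $i(a_k - a_l)$.
Finally, matrices of the block form

\[ \begin{pmatrix} 0 & E_{kk} \\ 0 & 0 \end{pmatrix}, \quad \begin{pmatrix} 0 & 0 \\ E_{kk} & 0 \end{pmatrix} \tag{8.15} \]

are root vectors for which the corresponding roots are $2ia_k$ and $-2ia_k$.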
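We may use the inner product $\langle X, Y\rangle = \operatorname{trace}(X^*Y)$ on sp(n;C), which
is invariant under the adjoint action of Sp(n) ⊂ U(2n). If we use this inner
product to identify $\mathfrak{h}^*$ with $\mathfrak{h}$, then the roots become the following elements
of $\mathfrak{h}$. The roots coming from (8.13) are

\[ \frac{i}{2}\begin{pmatrix} E_{kk} + E_{ll} & 0 \\ 0 & -E_{kk} - E_{ll} \end{pmatrix}, \quad \frac{i}{2}\begin{pmatrix} -E_{kk} - E_{ll} & 0 \\ 0 & E_{kk} + E_{ll} \end{pmatrix}; \tag{8.16} \]

the roots coming from (8.14) are

\[ \frac{i}{2}\begin{pmatrix} E_{kk} - E_{ll} & 0 \\ 0 & -E_{kk} + E_{ll} \end{pmatrix}; \tag{8.17} \]

and the roots coming from (8.15) are

\[ i\begin{pmatrix} E_{kk} & 0 \\ 0 & -E_{kk} \end{pmatrix}, \quad i\begin{pmatrix} -E_{kk} & 0 \\ 0 & E_{kk} \end{pmatrix}. \tag{8.18} \]

The roots in (8.16) and (8.17) have length 1 and the roots in (8.18) have
length $\sqrt{2}$.

As a base, we may take the n − 1 roots

\[ \frac{i}{2}\begin{pmatrix} E_{kk} - E_{k+1,k+1} & 0 \\ 0 & -E_{kk} + E_{k+1,k+1} \end{pmatrix}, \quad k = 1, \ldots, n-1, \tag{8.19} \]

together with the one additional root

\[ i\begin{pmatrix} E_{nn} & 0 \\ 0 & -E_{nn} \end{pmatrix}. \tag{8.20} \]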

The angle between two consecutive roots in (8.19) is 120°; nonconsecutive
roots in (8.19) are orthogonal. The angle between the root in (8.20) and the
last root in (8.19) is 135°; the root in (8.20) is orthogonal to the other roots
in (8.19).
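
The lengths and angles just quoted can be recomputed from the trace inner product. The sketch below represents each diagonal matrix in (8.19) and (8.20) by its list of diagonal entries and uses ⟨X, Y⟩ = trace(X*Y); the choice n = 3 and the function names are assumptions made for illustration.

```python
import math

def inner(x, y):
    """<X, Y> = trace(X* Y) for diagonal matrices given by their diagonals."""
    return sum(a.conjugate() * b for a, b in zip(x, y))

def short_simple(k, n):
    """Diagonal of the k-th root in (8.19), with k = 1, ..., n-1."""
    d = [0j] * (2 * n)
    d[k - 1], d[k] = 0.5j, -0.5j            # (i/2)(E_kk - E_{k+1,k+1})
    d[n + k - 1], d[n + k] = -0.5j, 0.5j    # (i/2)(-E_kk + E_{k+1,k+1})
    return d

def long_simple(n):
    """Diagonal of the root in (8.20)."""
    d = [0j] * (2 * n)
    d[n - 1], d[2 * n - 1] = 1j, -1j
    return d

def length(x):
    return math.sqrt(inner(x, x).real)

def angle_deg(x, y):
    c = inner(x, y).real / (length(x) * length(y))
    return math.degrees(math.acos(c))

n = 3
base = [short_simple(1, n), short_simple(2, n), long_simple(n)]
lengths = [length(b) for b in base]
angles = {(i, j): angle_deg(base[i], base[j])
          for i in range(3) for j in range(i + 1, 3)}
```

For n = 3 this gives lengths 1, 1, $\sqrt{2}$ and angles of 120° between the two roots from (8.19), 135° between the last root of (8.19) and the root (8.20), and 90° for the remaining pair.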


8.9 Dynkin Diagrams and the Classification 


In this section, we discuss (without proof) the classification, up to equivalence, 
of root systems. This leads to a classification, up to equivalence, of semisimple 
Lie algebras. The classification of root systems is given in terms of an object 
called the Dynkin diagram. 

Suppose $\Delta = \{\alpha_1, \ldots, \alpha_r\}$ is a base for a root system R. Then, the Dynkin
diagram for R (relative to the base $\Delta$) is a graph having vertices $v_1, \ldots, v_r$.
Between any two vertices, we place either no edge, one edge, two edges, or
three edges as follows. Consider distinct indices i and j. If the corresponding
roots $\alpha_i$ and $\alpha_j$ are orthogonal, then we put no edge between $v_i$ and $v_j$. In
the cases where $\alpha_i$ and $\alpha_j$ are not orthogonal, we put one edge between $v_i$
and $v_j$ if $\alpha_i$ and $\alpha_j$ have the same length, two edges if the longer of $\alpha_i$ and
$\alpha_j$ is $\sqrt{2}$ times as long as the shorter, and three edges if the longer of $\alpha_i$ and $\alpha_j$ is
$\sqrt{3}$ times as long as the shorter. In addition, if $\alpha_i$ and $\alpha_j$ are not orthogonal and
not of the same length, then we decorate the edges between $v_i$ and $v_j$ with
an arrow pointing from the vertex associated to the longer root toward the
vertex associated to the shorter root. (Thinking of the arrow as a “greater
than” sign makes it clear which way the arrow is supposed to go.) Proposition
8.6 tells us that if $\alpha_i$ and $\alpha_j$ are not orthogonal, then the only possible length
ratios are 1, $\sqrt{2}$, and $\sqrt{3}$. Furthermore, Propositions 8.6 and 8.11 tell us that
these three cases correspond to angles of 120°, 135°, and 150°, respectively.
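
The recipe for drawing a Dynkin diagram is mechanical enough to write down as a short function. The sketch below takes a base given in coordinates and returns, for each non-orthogonal pair, the number of edges and the direction of the arrow; the output format, the function name `dynkin_edges`, and the coordinate bases used in the examples are all assumptions made for illustration.

```python
from fractions import Fraction

def dot(u, v):
    return sum(Fraction(a) * Fraction(b) for a, b in zip(u, v))

def dynkin_edges(base):
    """Return {(i, j): (edges, arrow)} for i < j with non-orthogonal roots.

    edges is 1, 2, or 3. arrow is None when the two roots have equal length;
    otherwise it is (long_index, short_index), pointing from long to short.
    """
    result = {}
    for i in range(len(base)):
        for j in range(i + 1, len(base)):
            if dot(base[i], base[j]) == 0:
                continue                      # orthogonal roots: no edge
            li, lj = dot(base[i], base[i]), dot(base[j], base[j])
            ratio = max(li, lj) / min(li, lj)  # squared length ratio
            if ratio not in (1, 2, 3):
                raise ValueError("not a base of a root system")
            arrow = None if ratio == 1 else ((i, j) if li > lj else (j, i))
            result[(i, j)] = (int(ratio), arrow)
    return result

# Example bases in coordinates (assumed standard models).
b3 = [(1, -1, 0), (0, 1, -1), (0, 0, 1)]
c3 = [(1, -1, 0), (0, 1, -1), (0, 0, 2)]
g2 = [(1, -1, 0), (-2, 1, 1)]
```

Since the squared length ratio is 1, 2, or 3 exactly when the length ratio is 1, $\sqrt{2}$, or $\sqrt{3}$, the squared ratio can be used directly as the number of edges. For the B3 and C3 bases above, the diagrams differ only in the direction of the arrow on the double edge.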

Two Dynkin diagrams are said to be equivalent if there is a one-to-one, 
onto map of the vertices of one to the vertices of the other that preserves 
the number of bonds and the direction of the arrows. Recall (Theorem 8.20) 
that any two bases for the same root system can be mapped into one another 
by the action of the Weyl group. This implies that the equivalence class of 
the Dynkin diagram is independent of the choice of base. As we will see, not 
every graph arises as the Dynkin diagram of a root system, but only graphs 
of certain very special forms. 


Theorem 8.26. A root system is irreducible if and only if its Dynkin diagram 
is connected. 

Two root systems with equivalent Dynkin diagrams are equivalent. 

If $R^\vee$ is the dual root system to R, then the Dynkin diagram of $R^\vee$ is the
same as that of R except that the direction of each arrow is reversed.


So, the classification of irreducible root systems amounts to classifying all 
the connected diagrams that can arise as Dynkin diagrams of root systems. 

The calculations in the previous section and in Section 6.9 allow us to read 
off the Dynkin diagrams for the classical Lie algebras, sl(n;C), so(n;C), and
sp(n;C).

An. The root system An is the root system of sl(n+1;C), which has rank
n. The Dynkin diagram for An is shown in Figure 8.11.

Bn. The root system Bn is the root system of so(2n+1;C), which has
rank n. The Dynkin diagram for Bn is shown in Figure 8.12.

Cn. The root system Cn is the root system of sp(n;C), which has rank
n. The Dynkin diagram for Cn is shown in Figure 8.13.

Dn. The root system Dn is the root system of so(2n;C), which has rank
n. The Dynkin diagram for Dn is shown in Figure 8.14.


Fig. 8.11. The Dynkin diagram for An

Fig. 8.12. The Dynkin diagram for Bn

Fig. 8.13. The Dynkin diagram for Cn

Fig. 8.14. The Dynkin diagram for Dn


Certain special things happen in low rank. In rank one, there is only one
possible Dynkin diagram, reflecting that there is only one isomorphism class
of complex semisimple Lie algebras in rank one. The Lie algebra so(2;C) is
not semisimple and the remaining three, sl(2;C), so(3;C), and sp(1;C), are
isomorphic. In rank two, the Dynkin diagram D2 is disconnected, reflecting
that so(4;C) ≅ sl(2;C) ⊕ sl(2;C). Also, the Dynkin diagrams B2 and C2 are
isomorphic, reflecting that so(5;C) ≅ sp(2;C). In rank three, the Dynkin
diagrams A3 and D3 are isomorphic, reflecting that sl(4;C) ≅ so(6;C).

From the calculations in the previous section, we may observe certain 
things about the short and long roots in root systems where more than one 
length of root occurs. The long roots in Bn form a root system by themselves,
namely Dn. The short roots in Bn form a root system by themselves, namely
A1 × ⋯ × A1. In Cn, it is the reverse: The long roots form A1 × ⋯ × A1 and
the short roots form Dn.
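
These statements about long and short roots can be confirmed in coordinates. The sketch below assumes the usual models in R^n (Dn: ±e_i ± e_j; Bn: Dn together with ±e_i; Cn: Dn together with ±2e_i) and checks the case n = 4; the helper names are illustrative.

```python
from itertools import combinations, product

def roots_D(n):
    """Roots +-e_i +- e_j (i < j)."""
    roots = set()
    for i, j in combinations(range(n), 2):
        for si, sj in product((1, -1), repeat=2):
            v = [0] * n
            v[i], v[j] = si, sj
            roots.add(tuple(v))
    return roots

def axis_roots(n, scale):
    """Roots +-scale * e_i."""
    roots = set()
    for i in range(n):
        for s in (scale, -scale):
            v = [0] * n
            v[i] = s
            roots.add(tuple(v))
    return roots

def split_by_length(roots):
    """Return (short, long), assuming exactly two root lengths occur."""
    shortest = min(sum(x * x for x in a) for a in roots)
    short = {a for a in roots if sum(x * x for x in a) == shortest}
    return short, roots - short

def is_product_of_A1(roots):
    """True if any two distinct roots are either opposite or orthogonal."""
    for a, b in combinations(roots, 2):
        if sum(x * y for x, y in zip(a, b)) != 0 and a != tuple(-x for x in b):
            return False
    return True

n = 4
b_n = roots_D(n) | axis_roots(n, 1)
c_n = roots_D(n) | axis_roots(n, 2)
short_b, long_b = split_by_length(b_n)
short_c, long_c = split_by_length(c_n)
```

In this model the long roots of B4 and the short roots of C4 are both exactly the 24 roots of D4, while the remaining 8 roots in each case split into 4 mutually orthogonal pairs ±α.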

In addition to the root systems associated to the classical Lie algebras, 
there are five “exceptional” irreducible root systems, denoted G2, F4, E6,
E7, and E8, whose Dynkin diagrams are shown in Figure 8.15. We have constructed
the root system G2 explicitly; for constructions of the other exceptional
root systems, see Section 12 of Humphreys (1972).


Fig. 8.15. The exceptional Dynkin diagrams (panels E6, E7, E8, F4, and G2)


We are now ready to state (without proof) the classification theorem for
irreducible root systems.


Theorem 8.27. Every irreducible root system is isomorphic to precisely one 
root system from the following list: 


1. The classical root systems An, n ≥ 1
2. The classical root systems Bn, n ≥ 2
3. The classical root systems Cn, n ≥ 3
4. The classical root systems Dn, n ≥ 4
5. The exceptional root systems G2, F4, E6, E7, and E8


The restrictions on the values of n are to avoid the low-rank repetitions 
discussed earlier. The classification of irreducible root systems leads to a clas- 
sification of all root systems. Every root system can be decomposed as a direct 
sum of irreducible root systems, and the decomposition is unique. Thus, gen- 
eral root systems are classified by listing which irreducible summands occur 
and how many times each one occurs.

It turns out that the classification of semisimple Lie algebras is equivalent 
to the classification of root systems, as the following theorem explains. 


Theorem 8.28. 


1. If R1 and R2 are the root systems for two different Cartan subalgebras of
the same complex semisimple Lie algebra, then R1 and R2 are isomorphic.

2. A semisimple Lie algebra is simple if and only if its root system is irreducible.

3. If two complex semisimple Lie algebras have isomorphic root systems, then
the semisimple Lie algebras are isomorphic.

4. Every root system arises as the root system of some complex semisimple
Lie algebra.


Point 4 of the theorem can be proved either by a general construction 
of semisimple Lie algebras, as in Section 18 of Humphreys (1972), or by a 
case-by-case analysis. It suffices to prove this for irreducible root systems and 
we already know the result for the root systems of type A, B, C, and D. 
So, it suffices to construct Lie algebras corresponding to the exceptional root 
systems, as, for example, in Jacobson (1962). 

Theorems 8.27 and 8.28 lead to the following classification of complex 
simple Lie algebras. 


Theorem 8.29. Every complex simple Lie algebra is isomorphic to precisely 
one algebra from the following list: 


1. sl(n+1;C), n ≥ 1
2. so(2n+1;C), n ≥ 2
3. sp(n;C), n ≥ 3
4. so(2n;C), n ≥ 4
5. The exceptional Lie algebras G2, F4, E6, E7, and E8


A semisimple Lie algebra is determined up to isomorphism by specifying
which simple summands occur and how many times each one occurs.


8.10 The Root Lattice and the Weight Lattice 


If E is a finite-dimensional real vector space, then a subset $\Lambda$ of E is called a
lattice if there is some basis $v_1, \ldots, v_n$ for E such that $\Lambda$ is the set of all
linear combinations of $v_1, \ldots, v_n$ with integer coefficients. The set of all integer
linear combinations of roots is a lattice (called the root lattice) since the set
of such linear combinations is the same as the set of linear combinations of the
positive simple roots with integer coefficients, and the positive simple roots
form a basis for E. The set of all integral elements is also a lattice, called
the weight lattice, because it is the set of integer linear combinations of the
fundamental weights. The weight lattice is precisely the set of elements that
arise as weights of finite-dimensional representations of $\mathfrak{g}$.

Now, we have observed that every root is an integral element, and, therefore,
a linear combination of roots with integer coefficients is also an integral
element. This means that the root lattice is contained in the weight lattice.
Are the two lattices equal? In general, no, not even for sl(2;C). In the sl(2;C)
case, we think of the weights as eigenvalues of the element H. These eigenvalues
are integers, and every integer occurs as an eigenvalue of H in one of
the finite-dimensional representations $\pi_m$ of sl(2;C) described in Chapter 4.
Thus, the weight lattice is isomorphic to Z. Meanwhile, the eigenvalues of H
in the adjoint representation are 0, 2, and −2. So, the roots correspond to
the elements ±2 inside Z and the root lattice corresponds to the set of even
integers inside Z.

For any root system, we may regard both the root lattice and the weight 
lattice as commutative subgroups of E under the operation of vector addition. 
The quotient group (weight lattice) /(root lattice) is then a finite commutative 
group. The following result is offered without proof. 


Theorem 8.30. Suppose K is a simply-connected compact Lie group with Lie
algebra $\mathfrak{k}$. Let $\mathfrak{t}$ be a maximal commutative subalgebra of $\mathfrak{k}$ so that $\mathfrak{h} = \mathfrak{t} + i\mathfrak{t}$
is a Cartan subalgebra of $\mathfrak{k}_{\mathbb{C}}$. Consider the root lattice and the weight lattice
inside $\mathfrak{h}^*$. Then, the center of K satisfies

\[ Z(K) \cong (\text{weight lattice})/(\text{root lattice}). \]


It is essential in this result that K be simply connected. Note that the
weight lattice and the root lattice are purely Lie-algebraic constructions and,
thus, their quotient must be canonically associated to the Lie algebra. Although
there can be several different connected compact groups (with nonisomorphic
centers) having Lie algebra $\mathfrak{k}$, there is (up to isomorphism) only
one simply-connected group.

The way I have formulated Theorem 8.30 is “dual” to the usual formu- 
lation. For more information about this and for a result on the center of 
non-simply-connected compact groups, see Section E.4. 

Instead of considering the simply-connected group K, we can consider the 
adjoint group, Ad(K) := K/Z(K). (It is called this since the kernel of the 
adjoint representation is the center of K, which means that the adjoint group 
is isomorphic to the image of K under the adjoint representation.) If K is 
compact and simply connected, then Ad(K) is the unique (up to isomorphism) 
Lie group whose center is trivial and whose Lie algebra is isomorphic to the 
Lie algebra of K. Then, we have 


\[ \pi_1(\mathrm{Ad}(K)) \cong (\text{weight lattice})/(\text{root lattice}), \]


where $\pi_1$ denotes the fundamental group (Appendix E). This explains why
the quotient of the weight lattice by the root lattice is called the “fundamental 
group” of a root system in Humphreys (1972). 

Another way to think about the relationship between the two lattices is as 
follows. The weight lattice is the set of possible weights of representations of 
the Lie algebra $\mathfrak{k}$ or, equivalently, of the simply-connected group K. The root
lattice is the set of possible weights of representations of the adjoint group 
Ad(K). In particular, the roots themselves are the weights of the adjoint 
representation, which may be thought of as a representation of Ad(K). 

We have already pointed out that in the case of sl(2;C), the root lattice
inside the weight lattice may be thought of as the set of even integers inside
the set of all integers. Thus, in this case, the quotient is isomorphic to Z/2.
This reflects that the center of the compact simply-connected group SU(2) is
{I, −I} ≅ Z/2 and that the fundamental group of the adjoint group SO(3) ≅
SU(2)/{I, −I} is Z/2.

Suppose that $\alpha_1, \ldots, \alpha_r$ form a base for a root system R in E, and suppose
that $\mu_1, \ldots, \mu_r$ are the associated fundamental weights (having the property
that $2\langle\mu_k, \alpha_l\rangle/\langle\alpha_l, \alpha_l\rangle = \delta_{kl}$, k, l = 1, ..., r). Then, $\mu_1, \ldots, \mu_r$ form a basis
for E as a vector space. Thus, every element of E has a unique expansion in
terms of $\mu_1, \ldots, \mu_r$ and the integral elements are precisely those elements of E
for which the expansion coefficients are integers. Since each root is an integral
element, we have, for each k = 1, ..., r, $\alpha_k = n_{k1}\mu_1 + \cdots + n_{kr}\mu_r$, with each
$n_{kl}$ being an integer. We can then form an r × r matrix whose entries are
these integers. Then, the number of elements in (weight lattice)/(root lattice)
is equal to the absolute value of the determinant of this matrix.

To see that this is so, observe that the set of elements $a_1\mu_1 + \cdots + a_r\mu_r$, $0 \le a_k < 1$,
is a fundamental domain for the weight lattice; that is, every element of E
can be written uniquely as the sum of an element of this set and an element
of the weight lattice. Similarly, the set of elements $a_1\alpha_1 + \cdots + a_r\alpha_r$, $0 \le a_k < 1$,
is a fundamental domain for the root lattice. It is not hard to see that the number of elements
in the quotient (weight lattice)/(root lattice) is the ratio of the volume of
a fundamental domain for the root lattice to the volume of a fundamental
domain for the weight lattice. This ratio of volumes is the absolute value of
the determinant of the linear map that takes $\mu_k$ to $\alpha_k$ (k = 1, ..., r). The
matrix that represents this linear transformation in the basis $\mu_1, \ldots, \mu_r$ is the
matrix whose entries are $n_{kl}$ (up to a transpose, which does not affect the
determinant).

In the case of A2 (the root system for sl(3;C)), the fundamental weights
and positive simple roots are related by $\mu_1 = \tfrac{2}{3}\alpha_1 + \tfrac{1}{3}\alpha_2$ and $\mu_2 = \tfrac{1}{3}\alpha_1 + \tfrac{2}{3}\alpha_2$.
Inverting this gives $\alpha_1 = 2\mu_1 - \mu_2$ and $\alpha_2 = -\mu_1 + 2\mu_2$, which can also be
seen directly from Figure 5.2. Since

\[ \det\begin{pmatrix} 2 & -1 \\ -1 & 2 \end{pmatrix} = 3, \]

we conclude that (weight lattice)/(root lattice) has three elements in this case.
Figure 8.16 shows the root lattice (large triangles) and the weight lattice (small
triangles) for A2. The large triangles in Figure 8.16 have three times the area
of the small triangles, reflecting that the quotient of the two lattices has three
elements. (In each lattice, a triangle is half of a fundamental domain.)

Fig. 8.16. The root lattice and the weight lattice for A2


The reader is invited to perform the analogous calculation for the remain- 
ing rank-two root systems and verify the results of the following table. 


root system     # of elements

A1 × A1         4
A2              3
B2              2
G2              1

Note that for G2, each fundamental weight is a root, and thus in this
case, the root lattice and weight lattice are equal. The same holds for the
exceptional root systems F4 and E8. For all other irreducible root systems,
the root lattice is a proper sublattice of the weight lattice (see Section 3.6 of
Chapter X of Helgason (1978)).

For the case of sl(n;C) (as in Section 6.9), the roots are the diagonal 
matrices having one diagonal entry equal to 1, one diagonal entry equal to 
—1, and the other diagonal entries equal to 0. The root lattice (integer linear 
combinations of the roots) is then the set of all diagonal matrices whose 
diagonal entries are integers that sum to zero. 

Meanwhile, the co-roots are the same as the roots under our identifications. 
The integral elements are then the trace-zero diagonal matrices whose inner 
product with each (co-)root is an integer. This means that an integral element 
is a trace-zero diagonal matrix in which the difference of any two diagonal 
entries is an integer. It is not hard to show that such matrices are precisely 
those of the form 


\[ \operatorname{diag}\left(l_1 + \frac{k}{n},\ l_2 + \frac{k}{n},\ \ldots,\ l_n + \frac{k}{n}\right), \]


where k is an integer and $l_1, \ldots, l_n$ are integers satisfying $l_1 + \cdots + l_n = -k$.
Here, diag(·) denotes the diagonal matrix with the indicated diagonal entries.
It is then not hard to show that the weight lattice modulo the root lattice is
isomorphic to Z/n, reflecting that


\[ Z(SU(n)) = \{ e^{2\pi i k/n} I \mid k \in \mathbb{Z} \} \cong \mathbb{Z}/n. \]
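
The identification of the quotient with Z/n can be made concrete by attaching to each integral element the integer k modulo n appearing in its diagonal form. The sketch below represents a diagonal matrix by its list of diagonal entries (as exact fractions); the function names `is_integral` and `coset_label` are hypothetical, introduced only for this illustration.

```python
from fractions import Fraction

def is_integral(d):
    """Trace zero, and every difference of two diagonal entries is an integer."""
    return sum(d) == 0 and all((x - d[0]).denominator == 1 for x in d)

def coset_label(d):
    """For d = diag(l_1 + k/n, ..., l_n + k/n), return k modulo n.

    For an integral element, n times the first entry is an integer
    congruent to k modulo n.
    """
    n = len(d)
    return int(d[0] * n) % n

# Examples for sl(3;C), i.e. n = 3.
mu1 = [Fraction(2, 3), Fraction(-1, 3), Fraction(-1, 3)]   # integral, not in the root lattice
root = [Fraction(1), Fraction(-1), Fraction(0)]            # a root
twice_mu1 = [2 * x for x in mu1]
not_integral = [Fraction(1, 2), Fraction(-1, 2), Fraction(0)]
```

Elements of the root lattice have integer entries and therefore label 0, and labels add modulo n when integral elements are added, which is the group structure of Z/n.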


8.11 Exercises 


1. If R is a root system in E, consider the function $q : E \to \mathbb{R}$ given by

\[ q(H) = \prod_{\alpha \in R^+} \langle \alpha, H \rangle. \]

Using results from Section 8.7, show that q satisfies

\[ q(w \cdot H) = \det(w)\, q(H) \]

for all $w \in W$ and all $H \in E$.

2. Consider the diagrams of dominant integral elements for rank-two root
systems in Section 8.5. In each case, verify that the roots labeled $\alpha_1$ and
$\alpha_2$ form a base for the corresponding root system.

3. For each of the rank-two root systems, verify that the number of Weyl
chambers is equal to the order of the Weyl group. (Compare Theorem
8.20.)

4. Verify directly that any two bases for B2 can be mapped into one another
by the action of the Weyl group.


5. 


11. 


12. 
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For each of the irreducible rank-two root systems, verify that the Weyl 
group acts transitively on each length of root. (Compare Property 3 in 
Section 8.7.) 


. Consider the base {a1, a2} for By in Figure 8.6. Let pı and u2 denote the 


fundamental weights for B2, namely those satisfying 


for k,l € {1,2}. (The fundamental weights are circled in Figure 8.6.) Then, 
every dominant integral element u for B can be expressed in the form 
L = Mihi + M2, with mı and mo being non-negative integers. 

Using Theorem 7.43, compute the dimension of the irreducible represen- 
tation with highest weight m 1 + M22 explicitly as a function of mı 
and mz. 


. Prove Property 10 in Section 8.7 using Property 5. 


Hint: Suppose a is a positive simple root. Show that the positive roots 8 
with 8 Æ a and (a, 3) # 0 come in pairs {81, 82} with (a, 61) = — (a, Be). 


. Suppose 4, and u2 are dominant integral elements and that pi > po. 


Show, using the results of Section 8.7, that the convex hull of the W-orbit 
of u contains the convex hull of the W-orbit of u2. 


9. Suppose that $\mu$ is an integral element. Using Exercise 8 and Theorem
7.41, show that there are infinitely many inequivalent irreducible
representations of g for which $\mu$ is a weight.


10. Compute the root lattice and the weight lattice for $B_2$ and verify that
(weight lattice)/(root lattice) has two elements.

11. For which rank-two root systems is $-I$ an element of the Weyl group?
(Compare Section 7.6.1.)

12. Verify the multiplicities of the irreducible representation of sl(3;C) with
highest weight (1,2) given in Section 5.7. Use Kostant’s formula (Section
7.6) together with the invariance of the weights and multiplicities under
the action of the Weyl group.


A 


A Quick Introduction to Groups 


For the reader who may not have had a course in abstract group theory, I provide
a quick review here. Only the definitions and basic properties of groups are 
needed in reading this book. 


A.1 Definition of a Group and Basic Properties 


Definition A.1. A group is a set G, together with a map of $G \times G$ into G
(denoted $g * h$) with the following properties:
First, associativity: For all $g, h, k \in G$,

$$g * (h * k) = (g * h) * k. \tag{A.1}$$

Second, there exists an element e in G such that for all $g \in G$,

$$g * e = e * g = g \tag{A.2}$$

and such that for all $g \in G$, there exists $h \in G$ with

$$g * h = h * g = e. \tag{A.3}$$

If $g * h = h * g$ for all $g, h \in G$, then the group is said to be commutative
(or abelian).


The element e is (as we shall see shortly) unique and is called the identity 
element of the group or, simply, the identity. Part of the definition of a group 
is that multiplying a group element g by the identity on either the right or 
the left must give back g. 

The map of $G \times G$ into G is called the product operation for the group.
Part of the definition of a group G is that the product operation maps $G \times G$
into G (i.e., that the product of two elements of G is again an element of G).
This may not always be obvious in examples. This property is referred to as
closure.

Given a group element g, a group element h such that $g * h = h * g = e$
is called an inverse of g. We shall see momentarily that each group element
has a unique inverse.

Given a set and an operation, there are four things that must be checked 
to show that this is a group: closure, associativity, existence of an identity, 
and existence of inverses. 


Proposition A.2 (Uniqueness of the Identity). Let G be a group and let
$e, f \in G$ be such that for all $g \in G$,

$$e * g = g * e = g,$$
$$f * g = g * f = g.$$

Then, $e = f$.

Proof. Since e is an identity, we have

$$e * f = f.$$

On the other hand, since f is an identity, we have

$$e * f = e.$$

Thus, $e = e * f = f$. □


Proposition A.3 (Uniqueness of Inverses). Let G be a group, e the 
(unique) identity element of G, and g,h,k arbitrary elements of G. Suppose 
that 


$$g * h = h * g = e,$$
$$g * k = k * g = e.$$

Then, $h = k$.

Proof. We know that $g * h = g * k$ (both equal e). Multiplying on the left by h gives

$$h * (g * h) = h * (g * k).$$

By associativity, this gives

$$(h * g) * h = (h * g) * k,$$

and, so,

$$e * h = e * k,$$
$$h = k.$$


This is what we wanted to prove. □


Proposition A.4. Let G be a group, e the identity element of G, and g an
arbitrary element of G. Suppose $h \in G$ satisfies either $h * g = e$ or $g * h = e$.
Then, h is the (unique) inverse of g and, thus, both $h * g = e$ and $g * h = e$
hold.


Proof. To show that h is the inverse of g, we must show both that $h * g = e$
and $g * h = e$. Suppose we know, say, that $h * g = e$. Then, our goal is to show
that this implies that $g * h = e$.
Since $h * g = e$,

$$g * (h * g) = g * e = g.$$

By associativity, we have

$$(g * h) * g = g.$$

Now, by the definition of a group, g has an inverse. Let k be that inverse. (Of
course, in the end, we will conclude that $k = h$, but we cannot assume that
now.) Multiplying on the right by k and using associativity again gives

$$((g * h) * g) * k = g * k = e,$$
$$(g * h) * (g * k) = e,$$
$$(g * h) * e = e,$$
$$g * h = e.$$

A similar argument shows that if $g * h = e$, then $h * g = e$. □


Note that in order to show that $h * g = e$ implies $g * h = e$, we used the
fact that g has an inverse, since it is an element of a group. In more general
contexts (i.e., in some systems that are not groups), one may have $h * g = e$
but not $g * h = e$. (See Exercise 10.)


Notation A.5. For any group element g, its unique inverse will be denoted
$g^{-1}$.

Proposition A.6 (Properties of Inverses). Let G be a group, e its iden- 
tity, and g,h arbitrary elements of G. Then, 


$$(g^{-1})^{-1} = g,$$
$$(gh)^{-1} = h^{-1}g^{-1},$$
$$e^{-1} = e.$$

Proof. Exercise 3. □


A.2 Examples of Groups 


From now on, we will denote the product of two group elements g and h 
simply by gh, instead of the more cumbersome g * h. Moreover, since we have 
associativity, we will omit parentheses and write ghk in place of (gh)k or
g(hk).


A.2.1 The trivial group


The set with one element, e, is a group, with the group operation being defined 
as ee = e. This group is commutative. 

Associativity is automatic, since both sides of (A.1) must be equal to e. 
Of course, e itself is the identity and is its own inverse. Commutativity is also 
automatic. 


A.2.2 The integers 


The set Z of integers forms a group with the product operation being addition. 
This group is commutative. 

First, we check closure, namely that addition maps Z x Z into Z (i.e., that 
the sum of two integers is an integer). Since this is obvious, it remains only 
to check associativity, identity, and inverses. Addition is associative; zero is 
the additive identity (i.e., $0 + n = n + 0 = n$, for all $n \in \mathbb{Z}$); each integer
n has an additive inverse, namely $-n$. Since addition is commutative, Z is a
commutative group. 


A.2.3 The reals and $\mathbb{R}^n$


The set R of real numbers also forms a group under the operation of addition.
This group is commutative. Similarly, the n-dimensional Euclidean space $\mathbb{R}^n$
forms a group under the operation of vector addition. This group is also
commutative.

The verification is the same as for the integers. 


A.2.4 Nonzero real numbers under multiplication 


The set of nonzero real numbers forms a group with respect to the operation 
of multiplication. This group is commutative. 

Again, we check closure: The product of two nonzero real numbers is a
nonzero real number. Multiplication is associative; the number 1 is the
multiplicative identity; each nonzero real number x has a multiplicative inverse,
namely $1/x$. Since multiplication of real numbers is commutative, this is a
commutative group.

This group is denoted $\mathbb{R}^*$.


A.2.5 Nonzero complex numbers under multiplication 


The set of nonzero complex numbers forms a group with respect to the
operation of complex multiplication. This group is commutative and is denoted
$\mathbb{C}^*$.


A.2.6 Complex numbers of absolute value 1 under multiplication


The set of complex numbers with absolute value 1 (i.e., of the form $e^{i\theta}$) forms
a group under complex multiplication. This group is commutative.
This group is the unit circle, denoted $S^1$.


A.2.7 The general linear groups 


For each positive integer n, the set of all $n \times n$ invertible matrices with real
entries forms a group with respect to the operation of matrix multiplication.

We check closure: The product of two invertible matrices A and B is
invertible, since $(AB)^{-1} = B^{-1}A^{-1}$. Matrix multiplication is associative; the
identity matrix (with ones on the diagonal and zeros elsewhere) is the identity 
element; by definition, an invertible matrix has an inverse. Simple examples 
show that the group is noncommutative, except in the trivial case n = 1. (See 
Exercise 8.) 

This group is called the general linear group (over the reals) and is 
denoted GL(n; R). 

In the same way, we define the general linear group over the complex 
numbers, denoted GL(n; C). 


A.2.8 Symmetric group (permutation group) 


The set of one-to-one, onto maps of the set {1,2,...,n} to itself forms a group 
under the operation of composition. 

We check closure: The composition of two one-to-one, onto maps is again 
one-to-one and onto. Composition of functions is associative; the identity map 
(which sends 1 to 1, 2 to 2, etc.) is the identity element; a one-to-one, onto 
map has an inverse. Simple examples show that the group is noncommutative
for $n \geq 3$. (See Exercise 9.)

This group is called the symmetric group and is denoted $S_n$. A one-to-one,
onto map of {1,2,...,n} is a permutation, and, so, $S_n$ is also called the
permutation group. The group $S_n$ has n! elements.
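
The checks described above can be carried out by brute force for small n. The following Python sketch is an illustration only: it stores a permutation as a tuple whose entry in position i is the image of i, and the helper names `symmetric_group`, `compose`, `inverse`, and `is_commutative` are hypothetical names chosen for this example. Composition is performed right to left, as is usual for functions.

```python
from itertools import permutations
from math import factorial

def symmetric_group(n):
    """All one-to-one, onto maps of {1, ..., n} to itself, stored as tuples.

    The tuple s represents the map i -> s[i - 1].
    """
    return list(permutations(range(1, n + 1)))

def compose(s, t):
    """Return the composition s o t: apply t first, then s."""
    return tuple(s[t[i] - 1] for i in range(len(t)))

def inverse(s):
    """Return the inverse map of the permutation s."""
    inv = [0] * len(s)
    for i, image in enumerate(s, start=1):
        inv[image - 1] = i
    return tuple(inv)

def is_commutative(elements):
    """Check whether every pair of elements commutes."""
    return all(compose(s, t) == compose(t, s) for s in elements for t in elements)

for n in range(1, 5):
    group = symmetric_group(n)
    # Report n, whether the group has n! elements, and whether it is commutative.
    print(n, len(group) == factorial(n), is_commutative(group))
```

Storing permutations as tuples keeps them hashable, so they can be collected into sets when experimenting with subgroups.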


A.2.9 Integers mod n 


The set $\{0, 1, \ldots, n - 1\}$ forms a group under the operation of addition modulo
n, where n is a positive integer. This group is commutative.

Explicitly, the group operation is the following. Consider a, b in the set
$\{0, 1, \ldots, n - 1\}$. If $a + b < n$, then we define $a + b \bmod n$ to be $a + b$. If
$a + b \geq n$, then we define $a + b \bmod n$ to be $a + b - n$. (Since a and b are less
than n, $a + b - n$ is less than n; thus, we have closure.) To show associativity,
note that both

$$(a + b \bmod n) + c \bmod n$$

and

$$a + (b + c \bmod n) \bmod n$$

are equal to $a + b + c$, minus some multiple of n, and hence differ by a multiple
of n. However, since both are in the set $\{0, 1, \ldots, n - 1\}$, the only possible
multiple of n is zero. Zero is still the identity for addition modulo n. The
inverse of a nonzero element a in $\{0, 1, \ldots, n - 1\}$ is $n - a$, and the inverse
of 0 is 0. The group is commutative because ordinary addition is commutative.

This group is referred to as “Z mod n” and is denoted Z/n. 
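
The four group properties for Z/n can be verified exhaustively for small n. The sketch below is illustrative only; the function names `add_mod`, `inverse_mod`, and `is_group` are hypothetical. It implements addition modulo n by the two-case rule described above. Note that for a = 0 the formula n − a would give n, which lies outside the set, so the sketch returns 0 in that case.

```python
def add_mod(a, b, n):
    """Addition modulo n by the two-case rule, for a, b in {0, ..., n - 1}."""
    total = a + b
    return total if total < n else total - n

def inverse_mod(a, n):
    """Additive inverse in Z/n: n - a for nonzero a, and 0 for a = 0."""
    return 0 if a == 0 else n - a

def is_group(n):
    """Brute-force check of closure, associativity, identity, and inverses."""
    elements = range(n)
    closure = all(add_mod(a, b, n) in elements for a in elements for b in elements)
    associative = all(
        add_mod(add_mod(a, b, n), c, n) == add_mod(a, add_mod(b, c, n), n)
        for a in elements for b in elements for c in elements
    )
    identity = all(add_mod(a, 0, n) == a == add_mod(0, a, n) for a in elements)
    inverses = all(add_mod(a, inverse_mod(a, n), n) == 0 for a in elements)
    return closure and associative and identity and inverses

# Check the group properties for n = 1, ..., 7.
print(all(is_group(n) for n in range(1, 8)))
```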


A.3 Subgroups, the Center, and Direct Products 


Definition A.7. A subgroup of a group G is a subset H of G with the fol- 
lowing properties: 


1. The identity is an element of H. 
2. If $h \in H$, then $h^{-1} \in H$.
3. If $h_1, h_2 \in H$, then $h_1h_2 \in H$.


The conditions on H guarantee that H is a group, with the same product 
operation as G (but restricted to H). Closure is assured by Condition 3, 
associativity follows from associativity in G, and the existence of an identity 
and of inverses is assured by Conditions 1 and 2 (together with the existence of 
an identity and inverses in G). If H is a nonempty subset of G, then Conditions 
2 and 3 imply Condition 1. 

Every group G has at least two subgroups: G itself and the one-element 
subgroup {e}. (If G itself is the trivial group, then these two subgroups coin- 
cide.) These are called the trivial subgroups of G. 

The set of even integers is a subgroup of Z: Zero is even, the negative of 
an even integer is even, and the sum of two even integers is even. 

The set H of n x n real matrices with determinant one is a subgroup of 
GL(n; R). The set H is a subset of GL(n; R) because any matrix with determi- 
nant one is invertible. The identity matrix has determinant one, so Condition 
1 is satisfied. The determinant of the inverse is the reciprocal of the deter- 
minant, so Condition 2 is satisfied; and the determinant of a product is the 
product of the determinants, so Condition 3 is satisfied. This group is called 
the special linear group (over the reals) and is denoted SL(n; R). 

Additional examples, as well as some nonexamples, are given in Exercise 
2. 
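
For a finite group, the three conditions of Definition A.7 can be tested directly. The following sketch, with the hypothetical helper `is_subgroup`, does this for subsets of Z/n under addition modulo n; the choice n = 12 and the sample subsets are assumptions made purely for illustration.

```python
def is_subgroup(subset, n):
    """Check Conditions 1-3 of Definition A.7 for a subset of Z/n."""
    h = set(subset)
    has_identity = 0 in h                                      # Condition 1
    has_inverses = all((-a) % n in h for a in h)               # Condition 2
    is_closed = all((a + b) % n in h for a in h for b in h)    # Condition 3
    return has_identity and has_inverses and is_closed

print(is_subgroup({0, 3, 6, 9}, 12))   # multiples of 3 in Z/12
print(is_subgroup({0, 1}, 12))         # fails closure: 1 + 1 = 2 is missing
```

In Python, `(-a) % n` always lies in the range 0 to n − 1, so it gives the additive inverse of a in Z/n, including the case a = 0.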


Definition A.8. The center of a group G is the set of all $g \in G$ such that
$gh = hg$ for all $h \in G$.


It is not hard to see that the center of any group G is a subgroup of G.


Definition A.9. Let G and H be groups and consider the Cartesian product
of G and H (i.e., the set of ordered pairs (g,h) with $g \in G$, $h \in H$). Define a
product operation on this set as follows:

$$(g_1, h_1)(g_2, h_2) = (g_1g_2, h_1h_2).$$

This operation makes the Cartesian product of G and H into a group, called
the direct product of G and H and denoted $G \times H$.


It is a simple matter to check that this operation truly makes $G \times H$ into
a group. For example, the identity element of $G \times H$ is the pair $(e_1, e_2)$, where
$e_1$ is the identity for G and $e_2$ is the identity for H.


A.4 Homomorphisms and Isomorphisms 


Definition A.10. Let G and H be groups. A map $\Phi : G \to H$ is called a
homomorphism if $\Phi(gh) = \Phi(g)\Phi(h)$ for all $g, h \in G$. If, in addition, $\Phi$ is
one-to-one and onto, then $\Phi$ is called an isomorphism. An isomorphism of
a group with itself is called an automorphism.


Proposition A.11. Let G and H be groups, $e_1$ the identity element of G,
and $e_2$ the identity element of H. If $\Phi : G \to H$ is a homomorphism, then
$\Phi(e_1) = e_2$ and $\Phi(g^{-1}) = \Phi(g)^{-1}$ for all $g \in G$.


Proof. Let g be any element of G. Then, $\Phi(g) = \Phi(ge_1) = \Phi(g)\Phi(e_1)$.
Multiplying on the left by $\Phi(g)^{-1}$ gives $e_2 = \Phi(e_1)$. Now, consider $\Phi(g^{-1})$. Since
$\Phi(e_1) = e_2$, we have $e_2 = \Phi(e_1) = \Phi(gg^{-1}) = \Phi(g)\Phi(g^{-1})$. From Proposition
A.4, we conclude that $\Phi(g^{-1})$ is the inverse of $\Phi(g)$. □


Definition A.12. Let G and H be groups, $\Phi : G \to H$ a homomorphism, and
$e_2$ the identity element of H. The kernel of $\Phi$ is the set of all $g \in G$ for which
$\Phi(g) = e_2$.


Proposition A.13. Let G and H be groups and $\Phi : G \to H$ a homomorphism.
Then, the kernel of $\Phi$ is a subgroup of G.


Proof. Easy. □


Actually, the kernel of $\Phi$ will be a normal subgroup of G. See Section A.5.

Given any two groups G and H, we have the trivial homomorphism from
G to H: $\Phi(g) = e$ for all $g \in G$. The kernel of this homomorphism is all of G.

In any group G, the identity map (which maps every element g to itself) 
is an automorphism of G, whose kernel is just {e}. 

Let $G = H = \mathbb{Z}$ and define $\Phi(n) = 2n$. This is a homomorphism of Z
to itself, but not an automorphism. The kernel of this homomorphism is just
{0}.

The determinant is a homomorphism of GL(n; R) to $\mathbb{R}^*$. The kernel of this
map is SL(n; R).
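
The homomorphism property of the determinant can be spot-checked on a finite sample of matrices. The sketch below is an illustration for the case n = 2 only; it uses exact rational arithmetic, and the helper names `det2`, `matmul2`, and `sample_invertible_matrices`, as well as the particular choice of entries, are assumptions made for this example. A finite sample cannot prove the general statement; it only illustrates it.

```python
from fractions import Fraction
from itertools import product

def det2(m):
    """Determinant of a 2 x 2 matrix given as ((a, b), (c, d))."""
    (a, b), (c, d) = m
    return a * d - b * c

def matmul2(x, y):
    """Product of two 2 x 2 matrices."""
    return tuple(
        tuple(sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

def sample_invertible_matrices(entries):
    """All 2 x 2 matrices with entries from `entries` and nonzero determinant."""
    matrices = []
    for a, b, c, d in product(entries, repeat=4):
        m = ((a, b), (c, d))
        if det2(m) != 0:
            matrices.append(m)
    return matrices

entries = [Fraction(0), Fraction(1), Fraction(-1, 2)]
sample = sample_invertible_matrices(entries)

# Homomorphism property: det(AB) = det(A) det(B) for every pair in the sample.
homomorphism_ok = all(
    det2(matmul2(x, y)) == det2(x) * det2(y) for x in sample for y in sample
)

# Elements of the sample lying in the kernel, that is, having determinant one.
kernel_sample = [m for m in sample if det2(m) == 1]
print(homomorphism_ok, len(kernel_sample) > 0)
```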

Additional examples are given in Exercises 7 and 12. 

If there exists an isomorphism from G to H, then G and H are said to be 
isomorphic, and this relationship is denoted $G \cong H$. (See Exercise 4.) Two
groups which are isomorphic should be thought of as being (for all practical 
purposes) the same group. 


A.5 Quotient Groups 


Definition A.14. Let G be a group and N a subgroup of G. Then, N is called
a normal subgroup of G if for each element g of G and each element n of
N, the element $gng^{-1}$ belongs to N.


Note that if G is commutative, then every subgroup of G is automatically
normal, since, in the commutative case, $gng^{-1} = n$.


The motivation for this definition comes from the following result. 


Proposition A.15. If G and H are groups and $\Phi : G \to H$ is a homomorphism,
then $\ker \Phi$ is a normal subgroup of G.


Proof. Let $e_2$ denote the identity element of H. Suppose that g is an element
of G and n is an element of $\ker \Phi$ (i.e., that $\Phi(n) = e_2$). Then, we compute
that

$$\Phi(gng^{-1}) = \Phi(g)\Phi(n)\Phi(g)^{-1} = \Phi(g)\, e_2\, \Phi(g)^{-1} = e_2.$$

This shows that $gng^{-1}$ is, again, an element of $\ker \Phi$ and, thus, that $\ker \Phi$ is
a normal subgroup of G. □


Let G be a group and N a subgroup of G (for the moment, not assumed
normal). Then, define an equivalence relation on G by defining two elements
g and h to be equivalent if $gh^{-1}$ is an element of N. Let us see that this is
indeed an equivalence relation (i.e., that this notion of equivalence is reflexive,
symmetric, and transitive). For any g, $gg^{-1} = e$ and e is an element of N. This
shows that every element of G is equivalent to itself. If $gh^{-1}$ is an element
of N, then $(gh^{-1})^{-1} = hg^{-1}$ is also an element of N. This shows that if g is
equivalent to h, then h is also equivalent to g. Finally, if $gh^{-1}$ is an element
of N and $hk^{-1}$ is an element of N, then $gh^{-1}hk^{-1} = gk^{-1}$ is an element of
N. This shows that if g is equivalent to h and h is equivalent to k, then g is
equivalent to k.


Proposition A.16. Suppose that G is a group and N is a normal subgroup
of G. Define two elements g and h of G to be equivalent if $gh^{-1} \in N$.

1. If $g_1$ is equivalent to $g_2$ and $h_1$ is equivalent to $h_2$, then $g_1h_1$ is equivalent
to $g_2h_2$.
2. If $g_1$ is equivalent to $g_2$, then $g_1^{-1}$ is equivalent to $g_2^{-1}$.


In this proposition, it is essential that N be a normal subgroup of G. This 
proposition says that the equivalence relation “respects” the group operations 
of multiplication and inversion. 


Proof. Assume that $g_1$ is equivalent to $g_2$ and that $h_1$ is equivalent to $h_2$.
We want to show that $g_1h_1$ is equivalent to $g_2h_2$. We first show that $g_1h_1$ is
equivalent to $g_1h_2$ by computing

$$g_1h_1(g_1h_2)^{-1} = g_1h_1h_2^{-1}g_1^{-1}.$$

Now, we note that $h_1h_2^{-1}$ is an element of N, since $h_1$ and $h_2$ are assumed
equivalent. Then, because N is normal, we have that $g_1(h_1h_2^{-1})g_1^{-1}$ is also an
element of N and, therefore, $g_1h_1$ is equivalent to $g_1h_2$. Next, we show that
$g_1h_2$ is equivalent to $g_2h_2$ by computing that

$$g_1h_2(g_2h_2)^{-1} = g_1h_2h_2^{-1}g_2^{-1} = g_1g_2^{-1} \in N.$$

So, $g_1h_1$ is equivalent to $g_1h_2$, which is equivalent to $g_2h_2$. This implies (as
we have shown earlier) that $g_1h_1$ is equivalent to $g_2h_2$.
Meanwhile, if $g_1$ is equivalent to $g_2$, then

$$g_1^{-1}(g_2^{-1})^{-1} = g_1^{-1}g_2 = g_2^{-1}(g_2g_1^{-1})g_2.$$

However, since $g_2$ is equivalent to $g_1$, $g_2g_1^{-1}$ is an element of N, and then
because N is normal, $g_2^{-1}(g_2g_1^{-1})g_2$ is also an element of N. □


Now, if g is any element of G, let [g] denote the equivalence class containing
g; that is, [g] is the subset of G consisting of all elements equivalent to g
(including g itself). Since our relation is an equivalence relation (reflexive,
symmetric, and transitive), if $g_1$ is equivalent to $g_2$, then the equivalence class
$[g_1]$ is the same as the equivalence class $[g_2]$. Every element of G belongs to
precisely one equivalence class.


Definition A.17. Let G be a group and N a normal subgroup of G. The
quotient group G/N is the set of all equivalence classes in G, with the
product operation defined by


$$[g][h] = [gh].$$


Let us see if we understand what this means. The elements of G/N are 
equivalence classes, and the group product is defined by choosing one element 
g out of the first equivalence class, choosing one element h out of the second 
equivalence class, and then defining the product to be the equivalence class 
containing gh. We need to check that the group product is well defined (i.e.,
that the product does not depend on the choice of elements out of each
equivalence class). However, this is precisely where the normality of N comes in. If
we pick a different element $g'$ from the first equivalence class and a different
element $h'$ out of the second, then by Proposition A.16, $g'h'$ will be equivalent to
gh. This means that the equivalence class [gh] is the same as the equivalence
class $[g'h']$, which shows that our product really is well defined.

The idea behind the quotient group construction is that we make a new
group out of G by setting every element of N equal to the identity. This then
forces ng to be equal to g for any $g \in G$ and $n \in N$. However, ng and g
are equivalent, since $(ng)g^{-1} = n \in N$. So, setting elements of N equal to
the identity forces elements that are equivalent (in the above sense) to be
equal. The condition that N be a normal subgroup guarantees that we still
have defined group operations after setting equivalent elements equal to each
other.
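
The role of normality can be seen concretely in the symmetric group on three letters. The following sketch is an illustration under stated assumptions: permutations are stored as tuples of images, the helper names are hypothetical, `A3` is the subgroup consisting of the identity and the two 3-cycles (which is normal), and `H` is the two-element subgroup generated by one transposition (which is not normal). The function `product_is_well_defined` tests whether equivalent inputs always give equivalent products.

```python
from itertools import permutations

G = list(permutations(range(1, 4)))   # the symmetric group on {1, 2, 3}
E = (1, 2, 3)                         # identity permutation

def compose(s, t):
    """s o t: apply t first, then s."""
    return tuple(s[t[i] - 1] for i in range(3))

def inverse(s):
    """Inverse of the permutation s."""
    inv = [0, 0, 0]
    for i, image in enumerate(s, start=1):
        inv[image - 1] = i
    return tuple(inv)

def equivalent(g, h, subgroup):
    """g is equivalent to h exactly when g h^{-1} lies in the subgroup."""
    return compose(g, inverse(h)) in subgroup

def is_normal(subgroup):
    """Check that g n g^{-1} stays in the subgroup for all g and n."""
    return all(
        compose(compose(g, n), inverse(g)) in subgroup for g in G for n in subgroup
    )

def product_is_well_defined(subgroup):
    """Check that g ~ g2 and h ~ h2 always imply g h ~ g2 h2."""
    return all(
        equivalent(compose(g, h), compose(g2, h2), subgroup)
        for g in G for g2 in G if equivalent(g, g2, subgroup)
        for h in G for h2 in G if equivalent(h, h2, subgroup)
    )

A3 = {E, (2, 3, 1), (3, 1, 2)}   # identity together with the two 3-cycles
H = {E, (2, 1, 3)}               # generated by a single transposition

print(is_normal(A3), product_is_well_defined(A3))
print(is_normal(H), product_is_well_defined(H))
```

For the non-normal subgroup the product of equivalence classes depends on the representatives chosen, which is why the quotient construction is restricted to normal subgroups.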

The simplest example of a quotient group is the group of integers modulo 
n. In this case, we take G = Z and N = nZ (the set of integer multiples of n). 
It is easy to check that N is a subgroup of Z, and since Z is commutative, all 
subgroups are normal. To form the quotient group, we say that two elements 
of Z are equivalent if their difference is in N. (We use additive notation for 
the group operation in Z and, so, the quotient $gh^{-1}$ becomes $i - j$.) Thus,
the equivalence class of an integer i is the set of all integers that are equal 
modulo n to i. The operation of addition makes the set of equivalence classes 
into a group and this group is nothing but the group of integers modulo n, as 
described in Section A.2. (In Section A.2, we label equivalence classes modulo 
n by picking the unique element of the equivalence class that is between 0 and 
n—1.) 

Another example is obtained by taking G = SL(n;C) and taking N to
be the set of elements of SL(n;C) that are multiples of the identity. The
elements of N are the matrices of the form $e^{2\pi i k/n} I$, $k = 0, 1, \ldots, n - 1$. This
is a normal subgroup of SL(n;C) because each element of N is a multiple
of the identity, and, thus, for any $A \in \mathrm{SL}(n;\mathbb{C})$, we have

$$A(e^{2\pi i k/n} I)A^{-1} = AA^{-1}(e^{2\pi i k/n} I) = e^{2\pi i k/n} I.$$

The quotient group SL(n;C)/N is customarily
denoted PSL(n;C), where the P stands for “projective.” It can be shown that
PSL(n;C) is a simple group for all $n \geq 2$; that is, PSL(n;C) has no normal
subgroups other than {I} and PSL(n;C) itself.

If G is a group and N a normal subgroup, then there is a homomorphism
q of G into the quotient group G/N given by

$$q(g) = [g].$$

It follows from the definition of the product operation on G/N that q is indeed
a homomorphism and, clearly, q maps G onto G/N. More generally, suppose
that G and H are groups and that $\Phi : G \to H$ is a homomorphism. We have
observed that the kernel of $\Phi$ is a normal subgroup of G. If $\Phi$ maps G onto
H, then it can be shown that H is isomorphic to the quotient group $G/\ker \Phi$.


A.6 Exercises


Recall the definitions of the groups GL(n; R), $S_n$, $\mathbb{R}^*$, and Z/n from Section
A.2, and the definition of the group SL(n; R) from Section A.3.


1. Show that the center of any group G is a subgroup of G.

2. In (a)-(f), you are given a group G and a subset H of G. In each case,
determine whether H is a subgroup of G.

(a) $G = \mathbb{Z}$, H = {odd integers}

(b) $G = \mathbb{Z}$, H = {multiples of 3}

(c) $G = \mathrm{GL}(n;\mathbb{R})$, $H = \{A \in \mathrm{GL}(n;\mathbb{R}) \mid \det A \text{ is an integer}\}$

(d) $G = \mathrm{SL}(n;\mathbb{R})$, $H = \{A \in \mathrm{SL}(n;\mathbb{R}) \mid \text{all entries of } A \text{ are integers}\}$
Hint: Recall the formula for $A^{-1}$ in terms of cofactors of A.

(e) $G = \mathrm{GL}(n;\mathbb{R})$, $H = \{A \in \mathrm{GL}(n;\mathbb{R}) \mid \text{all entries of } A \text{ are rational}\}$

(f) $G = \mathbb{Z}/9$, $H = \{0, 2, 4, 6, 8\}$

3. Verify the properties of inverses in Proposition A.6.

4. Let G and H be groups. Suppose there exists an isomorphism $\phi$ from G
to H. Show that there exists an isomorphism from H to G.


5. Show that the set of positive real numbers is a subgroup of $\mathbb{R}^*$. Show that
this group is isomorphic to the group R.


6. Show that the set of automorphisms of any group G is itself a group, under
the operation of composition. This group is the automorphism group
of G, Aut(G).


7. Given any group G and any element g in G, define $\phi_g : G \to G$ by
$\phi_g(h) = ghg^{-1}$. Show that $\phi_g$ is an automorphism of G. Show that the
map $g \mapsto \phi_g$ is a homomorphism of G into Aut(G) and that the kernel of
this map is the center of G.

Note: An automorphism which can be expressed as $\phi_g$ for some $g \in G$ is
called an inner automorphism; any automorphism of G which is not
equal to any $\phi_g$ is called an outer automorphism.


8. Give an example of two $2 \times 2$ invertible real matrices which do not
commute. (This shows that GL(2; R) is not commutative.)


9. An element $\sigma$ of the permutation group $S_n$ can be written in a two-row
form:

$$\sigma = \begin{pmatrix} 1 & 2 & \cdots & n \\ \sigma_1 & \sigma_2 & \cdots & \sigma_n \end{pmatrix},$$

where $\sigma_i$ denotes $\sigma(i)$. Thus,

$$\sigma = \begin{pmatrix} 1 & 2 & 3 \\ 2 & 3 & 1 \end{pmatrix}$$

is the element of $S_3$ which sends 1 to 2, 2 to 3, and 3 to 1. When multiplying
(i.e., composing) two permutations, one performs the one on the right first
and then the one on the left. (This is the usual convention for composing
functions.)

Compute

$$\begin{pmatrix} 1 & 2 & 3 \\ 2 & 1 & 3 \end{pmatrix}\begin{pmatrix} 1 & 2 & 3 \\ 1 & 3 & 2 \end{pmatrix}$$

and

$$\begin{pmatrix} 1 & 2 & 3 \\ 1 & 3 & 2 \end{pmatrix}\begin{pmatrix} 1 & 2 & 3 \\ 2 & 1 & 3 \end{pmatrix}.$$

Conclude that $S_3$ is not commutative.

10. Consider the set $\mathbb{N} = \{0, 1, 2, \ldots\}$ of natural numbers and the set F of all
functions of $\mathbb{N}$ to itself. Composition of functions defines a map of $F \times F$
into F, which is associative. The identity ($\mathrm{id}(n) = n$) has the property
that $\mathrm{id} \circ f = f \circ \mathrm{id} = f$, for all f in F. However, since we do not restrict
to functions which are one-to-one and onto, not every element of F has
an inverse. Thus, F is not a group.

Give an example of two functions f and g in F such that $f \circ g = \mathrm{id}$, but
$g \circ f \neq \mathrm{id}$. (Compare with Proposition A.4.)

11. Consider the groups Z and Z/n. For each a in Z, define $a \bmod n$ to be
the unique element b of $\{0, 1, \ldots, n - 1\}$ such that a can be written as
$a = kn + b$, with k an integer. Show that the map $a \mapsto a \bmod n$ is a
homomorphism of Z into Z/n.

12. Show that the center of any group G is a normal subgroup of G.


B 


Linear Algebra Review 


In this appendix, I collect together results from linear algebra that are used in 
the text. Only the simplest proofs are given here. The results quoted here are 
mostly standard, except for the SN decomposition, which is often skipped over 
on the way to the Jordan canonical form, and the discussion of weight spaces. 
For more information, the reader is encouraged to consult such standard linear 
algebra textbooks as Hoffman and Kunze (1971) or Axler (1997). 


B.1 Eigenvectors, Eigenvalues, and the Characteristic 
Polynomial 


If A is any matrix in $M_n(\mathbb{C})$, then a nonzero vector v in $\mathbb{C}^n$ is called an
eigenvector for A if there is some complex number $\lambda$ such that

$$Av = \lambda v.$$

An eigenvalue for A is a complex number $\lambda$ for which there exists a nonzero
$v \in \mathbb{C}^n$ with $Av = \lambda v$. So, $\lambda$ is an eigenvalue for A if the equation $Av = \lambda v$
or, equivalently, the equation

$$(A - \lambda I)v = 0,$$

has a nonzero solution v. This happens precisely when $A - \lambda I$ fails to be
invertible, which is precisely when $\det(A - \lambda I) = 0$.
For any $A \in M_n(\mathbb{C})$, we define the characteristic polynomial p of A to
be given by

$$p(\lambda) = \det(A - \lambda I), \quad \lambda \in \mathbb{C}.$$

This is a polynomial of degree n. In light of the above discussion, the
eigenvalues are precisely the zeros of the characteristic polynomial.
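
For a 2 × 2 matrix the characteristic polynomial is a quadratic, so its zeros can be written down with the quadratic formula. The sketch below is an illustration restricted to the 2 × 2 case; the function names `eigenvalues_2x2` and `char_poly` are hypothetical. The sample matrix, a rotation by a quarter turn, is chosen because it has no real eigenvalues, which shows why complex numbers are used.

```python
import cmath

def eigenvalues_2x2(a, b, c, d):
    """Zeros of p(lam) = det(A - lam I) for A = [[a, b], [c, d]].

    For a 2 x 2 matrix, p(lam) = lam^2 - (a + d) lam + (a d - b c).
    """
    trace = a + d
    det = a * d - b * c
    root = cmath.sqrt(trace * trace - 4 * det)
    return (trace + root) / 2, (trace - root) / 2

def char_poly(a, b, c, d, lam):
    """Evaluate det(A - lam I) directly from the definition."""
    return (a - lam) * (d - lam) - b * c

# Rotation by a quarter turn: rows (0, -1) and (1, 0).
lam1, lam2 = eigenvalues_2x2(0, -1, 1, 0)
print(lam1, lam2)
print(abs(char_poly(0, -1, 1, 0, lam1)), abs(char_poly(0, -1, 1, 0, lam2)))
```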

More generally, we may consider vector spaces. A vector space is a set
V together with two operations: one called vector addition that takes two
elements of V and produces another element of V, and one called scalar
multiplication that takes a complex number and an element of V and produces
another element of V. To be a vector space, these two operations should have
the same algebraic properties as vector addition and scalar multiplication in
$\mathbb{C}^n$ (e.g., that vector addition is commutative and associative). There is a
notion of the dimension of a vector space V, which may be infinite.

A linear operator on a vector space V (also called a linear transformation)
is a map A of V to itself that satisfies

$$A(u + v) = Au + Av,$$
$$A(\lambda u) = \lambda Au$$

for all u and v in V and all $\lambda$ in $\mathbb{C}$. If A is an $n \times n$ matrix, then the map
that sends a vector u in $\mathbb{C}^n$ to the vector Au (defined as the matrix product
of the $n \times n$ matrix A and the $n \times 1$ matrix u) is a linear operator, and every
linear operator on $\mathbb{C}^n$ arises in this way for some unique matrix A. So, we will
interchangeably regard A as an $n \times n$ matrix or as a linear transformation of
$\mathbb{C}^n$.

For any linear operator, we may define the notion of eigenvector and
eigenvalue in precisely the same way as for $\mathbb{C}^n$. If A is a linear operator on a
finite-dimensional space, then the theory of eigenvectors and eigenvalues for A is
the same as for matrices. If A is a linear operator on an infinite-dimensional
space, then A may not have any eigenvectors.

If A is a linear operator on a vector space V and $\lambda$ is an eigenvalue for A,
then the $\lambda$-eigenspace for A, denoted $V_\lambda$, is the set of all vectors $v \in V$
(including the zero vector) that satisfy $Av = \lambda v$. The $\lambda$-eigenspace for A is a
subspace of V. The dimension of this space is called the multiplicity of $\lambda$. (More
precisely, this is the “geometric multiplicity” of $\lambda$. In the finite-dimensional
case, there is also a notion of the “algebraic multiplicity” of $\lambda$, which is the
number of times that $\lambda$ occurs as a root of the characteristic polynomial. The
geometric multiplicity of $\lambda$ cannot exceed the algebraic multiplicity.)


Proposition B.1. Suppose that A is a linear operator on a vector space V
and $v_1, \ldots, v_k$ are eigenvectors with distinct eigenvalues $\lambda_1, \ldots, \lambda_k$. Then,
$v_1, \ldots, v_k$ are linearly independent.


Note that here V does not have to be finite dimensional. 


Proposition B.2. Every linear operator A on a finite-dimensional complex 
vector space has at least one eigenvector. 


This follows from the fundamental theorem of algebra, which says that
every nonconstant polynomial with complex coefficients has at least one complex
root, applied to the characteristic polynomial.


B.2 Diagonalization


Two matrices $A, B \in M_n(\mathbb{C})$ are said to be similar if there exists an invertible
matrix C such that

$$A = CBC^{-1},$$

in which case, $B = C^{-1}AC$. The operation $B \mapsto CBC^{-1}$ is called conjugation
of B by C. A matrix is said to be diagonalizable if it is similar to a
diagonal matrix. A matrix $A \in M_n(\mathbb{C})$ is diagonalizable if and only if there
exist n linearly independent eigenvectors for A. Specifically, if $v_1, \ldots, v_n$ are
linearly independent eigenvectors, let C be the matrix whose kth column is
$v_k$. Then, C is invertible and we will have

$$A = C \begin{pmatrix} \lambda_1 & & \\ & \ddots & \\ & & \lambda_n \end{pmatrix} C^{-1},$$

where $\lambda_1, \ldots, \lambda_n$ are the eigenvalues associated to the eigenvectors $v_1, \ldots, v_n$,
in that order.
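
The factorization of a diagonalizable matrix into eigenvector columns, a diagonal matrix of eigenvalues, and the inverse of the eigenvector matrix can be checked by hand on a small example. In the following sketch, the matrix with rows (2, 1) and (1, 2) is an example chosen for illustration, not one taken from the text; its eigenvectors (1, 1) and (1, −1) have eigenvalues 3 and 1. The helper names `matmul` and `inverse_2x2` are hypothetical, and exact rational arithmetic is used so that the comparison is not affected by rounding.

```python
from fractions import Fraction

def matmul(x, y):
    """Product of two square matrices given as lists of rows."""
    n = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def inverse_2x2(m):
    """Inverse of an invertible 2 x 2 matrix, computed from the cofactor formula."""
    (a, b), (c, d) = m
    det = Fraction(a * d - b * c)
    return [[d / det, -b / det], [-c / det, a / det]]

A = [[2, 1], [1, 2]]
# Eigenvectors (1, 1) and (1, -1), with eigenvalues 3 and 1, placed as the columns of C.
C = [[1, 1], [1, -1]]
D = [[3, 0], [0, 1]]

reconstructed = matmul(matmul(C, D), inverse_2x2(C))
print(reconstructed == A)
```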

If $A \in M_n(\mathbb{C})$ has n distinct eigenvalues (i.e., n distinct roots to the
characteristic polynomial), then A is automatically diagonalizable, by Proposition
B.1. If the characteristic polynomial of A has repeated roots, then A may or
may not be diagonalizable.

Recall that for $A \in M_n(\mathbb{C})$, the adjoint of A, denoted $A^*$, is the
conjugate-transpose of A,

$$(A^*)_{kl} = \overline{A_{lk}}.$$

A matrix A is said to be self-adjoint (or Hermitian) if $A^* = A$. A matrix A
is said to be skew self-adjoint (or skew Hermitian or just skew) if $A^* = -A$.
A matrix is said to be unitary if $A^* = A^{-1}$. If A is self-adjoint, skew
self-adjoint, or unitary, then A is automatically diagonalizable. Furthermore,
in these cases, it is possible to find an orthonormal basis of eigenvectors for
A, which means that the matrix C in the definition of diagonalizability may
be taken to be unitary.

If A is self-adjoint, then all of its eigenvalues are real. If A is real and 
self-adjoint (or, equivalently, real and symmetric), then the eigenvectors may 
be taken to be real as well, which means that in this case, the matrix C may 
be taken to be orthogonal. If A is skew, then its eigenvalues are purely imaginary.
If A is unitary, then its eigenvalues are complex numbers of absolute value 1
(i.e., of the form $\lambda = e^{i\theta}$, with $\theta \in \mathbb{R}$).

A matrix A is said to be normal if A commutes with its adjoint (i.e., if
$AA^* = A^*A$). If A is self-adjoint, skew, or unitary, then it is normal (since
in those cases, $A^*$ is A or $-A$ or $A^{-1}$, all of which commute with A). A
normal matrix is automatically diagonalizable and has an orthonormal basis
of eigenvectors. We summarize the results of the previous paragraphs in the
following.


Theorem B.3. Suppose that $A \in M_n(\mathbb{C})$ has the property that $A^*A = AA^*$
(e.g., if $A^* = A$, $A^* = A^{-1}$, or $A^* = -A$). Then, A is diagonalizable and it
is possible to find an orthonormal basis for $\mathbb{C}^n$ consisting of eigenvectors for
A.

If A is real and symmetric, then all of the eigenvalues of A are real and it
is possible to choose an orthonormal basis of eigenvectors for A in which each
eigenvector is real.


Proposition B.4. Suppose that A and B are linear operators on a
finite-dimensional vector space V and suppose that $AB = BA$. Then, B maps the
$\lambda$-eigenspace of A into itself, for each eigenvalue $\lambda$ of A.


Proof. Let $\lambda$ be an eigenvalue of A and let $V_\lambda$ be the $\lambda$-eigenspace of A. Then,
let v be an element of $V_\lambda$ and consider Bv. Since B commutes with A, we
have

$$A(Bv) = BAv = \lambda Bv;$$

that is, applying A to Bv gives us back $\lambda$ times the vector we started with,
and, so, Bv is, again, an element of $V_\lambda$. □


B.3 Generalized Eigenvectors and the SN Decomposition 


Not all matrices are diagonalizable. For example, consider the matrix

$$A = \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}.$$

The characteristic polynomial of this is $p(\lambda) = (\lambda - 1)^2$, so the only eigenvalue
of A is $\lambda = 1$. Solving the equation $Av = v$ gives

$$v = \begin{pmatrix} c \\ 0 \end{pmatrix},$$

where c is an arbitrary constant. This means that we cannot find two linearly
independent eigenvectors for A.

If A does not have enough linearly independent eigenvectors to be di- 
agonalizable, then we may consider the more general concept of generalized 
eigenvectors. A nonzero vector $v \in \mathbb{C}^n$ is called a generalized eigenvector
for A if there is some complex number $\lambda$ and some positive integer k such that

\[ (A - \lambda I)^k v = 0. \]

This can happen only if $(A - \lambda I)$ is noninvertible. This means that the number
$\lambda$ must be an (ordinary) eigenvalue for A. However, given an eigenvalue $\lambda$,
there may be generalized eigenvectors v that are not ordinary eigenvectors (in
addition to at least one ordinary eigenvector). For example, if A is the 2 x 2
matrix given earlier, then we may check that

\[ (A - I)^2 \begin{pmatrix} 0 \\ 1 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix}, \]


so that (0,1) is a generalized eigenvector for A (but not an ordinary eigenvec- 
tor). This is in addition to (1,0), which is an ordinary eigenvector. 
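
As a quick check of this example, here is a minimal Python sketch using only the standard library. It assumes A is the 2 x 2 matrix with rows (1, 1) and (0, 1), which is what the surrounding computations indicate; the helper names `matmul`, `matvec`, and `sub_identity` are ours and do not come from the text.

```python
# Check the generalized-eigenvector example for A = [[1, 1], [0, 1]].

def matmul(X, Y):
    """Product of two square matrices given as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def matvec(X, v):
    """Apply the matrix X to the column vector v."""
    return [sum(X[i][k] * v[k] for k in range(len(v))) for i in range(len(X))]

def sub_identity(X, lam):
    """Return X - lam * I."""
    n = len(X)
    return [[X[i][j] - (lam if i == j else 0) for j in range(n)]
            for i in range(n)]

A = [[1, 1], [0, 1]]
B = sub_identity(A, 1)   # B = A - I
B2 = matmul(B, B)        # B2 = (A - I)^2

once_on_e2 = matvec(B, [0, 1])    # nonzero: (0, 1) is not an ordinary eigenvector
twice_on_e2 = matvec(B2, [0, 1])  # zero: (0, 1) is a generalized eigenvector
once_on_e1 = matvec(B, [1, 0])    # zero: (1, 0) is an ordinary eigenvector
```

Here k = 2 is the smallest exponent that annihilates (0, 1), whereas k = 1 already works for (1, 0).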

Given any $A \in M_n(\mathbb{C})$, it is possible to find a basis $v_1, \ldots, v_n$ for $\mathbb{C}^n$ such
that each $v_k$ is a generalized eigenvector for A. This can be proved fairly easily
by induction on n. Given a matrix A and an eigenvalue $\lambda$, let $V_\lambda$ be defined
by

\[ V_\lambda = \{ v \in \mathbb{C}^n \mid (A - \lambda I)^k v = 0 \text{ for some } k \}; \]

that is, $V_\lambda$ is the space of all generalized eigenvectors with eigenvalue $\lambda$,
together with the zero vector, which, by definition, is not a generalized
eigenvector but which satisfies $(A - \lambda I)^k v = 0$. For any $\lambda$, $V_\lambda$ is a subspace of $\mathbb{C}^n$;
that is, any linear combination of elements of $V_\lambda$ is, again, in $V_\lambda$. It can be
shown that $\mathbb{C}^n$ decomposes as a direct sum of the $V_\lambda$'s, as $\lambda$ ranges over all the
eigenvalues of A. This means that if $\lambda_1, \ldots, \lambda_k$ denote the distinct eigenvalues
for A (with $k \le n$), then every vector v in $\mathbb{C}^n$ can be written uniquely as

\[ v = v_1 + v_2 + \cdots + v_k, \]

with each $v_j$ in $V_{\lambda_j}$. In particular, this means that every matrix has a basis
of generalized eigenvectors.

Now, if v is in $V_\lambda$, then Av is also in $V_\lambda$, since $(A - \lambda I)^k Av = A(A - \lambda I)^k v = 0$.
This means that the subspace $V_\lambda$ is invariant under the matrix A. Let $A_\lambda$
denote the restriction of A to the subspace $V_\lambda$, and write $A_\lambda$ in the form

\[ A_\lambda = \lambda I + N_\lambda \]

(i.e., we define $N_\lambda$ to be $A_\lambda - \lambda I$). Then, $N_\lambda$ is nilpotent; that is, $N_\lambda^k = 0$
for some positive integer k. We summarize the preceding discussion in the
following theorem.


Theorem B.5. Let A be an n x n complex matrix. Then, there exists a basis
for $\mathbb{C}^n$ consisting of generalized eigenvectors for A. Furthermore, $\mathbb{C}^n$ is the
direct sum of the generalized eigenspaces $V_\lambda$, each $V_\lambda$ is invariant under A, and
the restriction of A to each $V_\lambda$ is of the form $\lambda I + N_\lambda$, where $N_\lambda$ is nilpotent.


Theorem B.6. Let A be an n x n complex matrix. Then, there exists a unique
pair (S, N) of matrices with the following properties: (1) A = S + N, (2)
SN = NS, (3) S is diagonalizable, and (4) N is nilpotent.


The expression A = S + N, with S and N as in the theorem, is called 
the SN decomposition of A. The existence of an SN decomposition follows 
from the previous theorem: We define S to be the operator equal to $\lambda I$ on
each generalized eigenspace $V_\lambda$ of A and we set N to be the operator equal to $N_\lambda$
on each generalized eigenspace. For example, if A is the 2 x 2 matrix defined
earlier, then we have

\[ S = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}, \qquad N = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}. \]
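
The properties listed in Theorem B.6 can be checked directly for this example. The following is a minimal sketch in plain Python, assuming A has rows (1, 1) and (0, 1), S is the identity, and N = A - S; the helpers `matmul` and `matadd` are hypothetical names introduced only for this illustration.

```python
# Check the SN decomposition A = S + N for the 2 x 2 example.

def matmul(X, Y):
    """Product of two square matrices given as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def matadd(X, Y):
    """Entrywise sum of two square matrices."""
    n = len(X)
    return [[X[i][j] + Y[i][j] for j in range(n)] for i in range(n)]

A = [[1, 1], [0, 1]]
S = [[1, 0], [0, 1]]   # already diagonal, hence diagonalizable: property (3)
N = [[0, 1], [0, 0]]

sum_ok = matadd(S, N) == A                       # property (1): A = S + N
commute_ok = matmul(S, N) == matmul(N, S)        # property (2): SN = NS
nilpotent_ok = matmul(N, N) == [[0, 0], [0, 0]]  # property (4): N^2 = 0
```

The sketch only verifies existence for this one matrix; the uniqueness assertion of the theorem is not something a numerical check can establish.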


One useful result that follows fairly easily from the SN decomposition is 
the following. 


Theorem B.7. Every matrix is similar to an upper triangular matrix. Every
nilpotent matrix is similar to an upper triangular matrix with zeros on the 
diagonal. 


B.4 The Jordan Canonical Form 


The Jordan canonical form may be viewed as a refinement of the SN decomposition,
based on a further analysis of the nilpotent matrices $N_\lambda$ in Theorem
B.5. Although the SN decomposition is sufficient for the purposes of this book, 
I discuss the Jordan canonical form here simply because it is more commonly 
taught in linear algebra courses. (The Jordan canonical form could be useful 
for a few of the exercises in Chapter 2.) To get to the Jordan form from the 
SN decomposition, one needs to be able to classify nilpotent matrices up to 
similarity. 


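Theorem B.8. Every $A \in M_n(\mathbb{C})$ is similar to a block-diagonal matrix in
which each block is of the form

\[ \begin{pmatrix} \lambda & 1 & & \\ & \lambda & \ddots & \\ & & \ddots & 1 \\ & & & \lambda \end{pmatrix}. \]

Two matrices A and B are similar if and only if they have precisely the same
Jordan blocks, up to reordering.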


There may be several different Jordan blocks (possibly of different sizes) 
for the same value of $\lambda$. In the case in which A is diagonalizable, each block
is 1 x 1, in which case, the 1's above the diagonal do not appear. Note that
each Jordan block is, in particular, of the form $\lambda I + N$, where N is nilpotent.


B.5 The Trace 


If A is an n x n matrix, we define the trace of A to be the sum of the diagonal
entries of A; that is,

\[ \operatorname{trace}(A) = \sum_{k=1}^{n} A_{kk}. \]

Note that the trace is a linear function of A (unlike the determinant).
If A and B are two n x n matrices, then

\[ \operatorname{trace}(AB) = \sum_{k=1}^{n} (AB)_{kk} = \sum_{k=1}^{n} \sum_{l=1}^{n} A_{kl} B_{lk}. \tag{B.1} \]

Similarly,

\[ \operatorname{trace}(BA) = \sum_{k=1}^{n} \sum_{l=1}^{n} B_{kl} A_{lk} = \sum_{k=1}^{n} \sum_{l=1}^{n} A_{lk} B_{kl}, \tag{B.2} \]

which is just the same sum as (B.1) with the labels for the summation variables
reversed. Thus, we conclude that trace(AB) = trace(BA). Then, if C is an
invertible matrix and we apply this to the matrices CA and $C^{-1}$, we have

\[ \operatorname{trace}(C A C^{-1}) = \operatorname{trace}(C^{-1} C A) = \operatorname{trace}(A); \]


that is, the trace is invariant under conjugation, or, similar matrices have the 
same trace. 

More generally, if A is a linear operator on a finite-dimensional vector 
space V, we can define the trace of A by picking a basis and then defining the 
trace of A to be the trace of the matrix that represents A in that basis. The 
above calculations show that the value of the trace of A is independent of the 
choice of basis. 
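
The two identities above are easy to confirm on a small case. The following plain-Python sketch uses three integer matrices chosen purely for illustration (they are not from the text); `matmul` and `trace` are our own helper names, and `C_inv` is the hand-computed inverse of `C`.

```python
# Check trace(AB) = trace(BA) and trace(C A C^{-1}) = trace(A) on an example.

def matmul(X, Y):
    """Product of two square matrices given as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def trace(X):
    """Sum of the diagonal entries of X."""
    return sum(X[k][k] for k in range(len(X)))

A = [[1, 2], [3, 4]]
B = [[0, 1], [5, -2]]
C = [[2, 1], [1, 1]]         # determinant 1, so C is invertible
C_inv = [[1, -1], [-1, 2]]   # inverse of C, computed by hand

tr_AB = trace(matmul(A, B))
tr_BA = trace(matmul(B, A))
tr_conj = trace(matmul(matmul(C, A), C_inv))
```

Note that AB and BA are different matrices here; only their traces coincide.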


B.6 Inner Products 


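Let $\langle \cdot, \cdot \rangle$ denote the standard inner product on $\mathbb{C}^n$, defined by

\[ \langle x, y \rangle = \sum_{k=1}^{n} \overline{x_k}\, y_k, \]

where we follow the convention of putting the complex-conjugate on the first
factor. If A is any matrix in $M_n(\mathbb{C})$, then the adjoint $A^*$ of A has the property
that

\[ \langle u, Av \rangle = \langle A^* u, v \rangle \tag{B.3} \]

for all $u, v \in \mathbb{C}^n$.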

If V is any vector space over $\mathbb{C}$, then an inner product on V is a map
that associates to any two vectors u and v in V a complex number $\langle u, v \rangle$ and
that has the following properties:

(1) Conjugate-symmetry: $\langle v, u \rangle = \overline{\langle u, v \rangle}$ for all $u, v \in V$.

(2) Linearity in the second factor: $\langle u, v_1 + a v_2 \rangle = \langle u, v_1 \rangle + a \langle u, v_2 \rangle$, for all
$u, v_1, v_2 \in V$ and $a \in \mathbb{C}$.

(3) Positivity: For all $v \in V$, $\langle v, v \rangle$ is real and satisfies $\langle v, v \rangle \ge 0$, and
$\langle v, v \rangle = 0$ only if v = 0.

Note that in light of the conjugate-symmetry and the linearity in the
second factor, an inner product must be conjugate-linear in the first factor:

\[ \langle v_1 + a v_2, u \rangle = \langle v_1, u \rangle + \bar{a} \langle v_2, u \rangle. \]


An inner product on a real vector space is defined in the same way except 
that conjugate-symmetry is replaced by symmetry ((v,u) = (u,v)) and the 
constant a in Point 2 now takes only real values. 

If V is a vector space with inner product, then the norm of a vector $v \in V$,
denoted ||v||, is defined by

\[ \|v\| = \sqrt{\langle v, v \rangle}. \]


The positivity condition on the inner product guarantees that ||v|| is always 
a non-negative real number and that ||v|| = 0 only if v = 0. 

As an example, consider $M_n(\mathbb{C})$, which is a complex vector space, and
define an inner product on $M_n(\mathbb{C})$ by

\[ \langle A, B \rangle = \operatorname{trace}(A^* B). \tag{B.4} \]

Note that

\[ \operatorname{trace}(A^* B) = \operatorname{trace}((B^* A)^*) = \overline{\operatorname{trace}(B^* A)}, \]

which shows that $\langle \cdot, \cdot \rangle$ is conjugate-symmetric. Linearity in the second factor
follows from linearity of the trace. Finally,

\[ \operatorname{trace}(A^* A) = \sum_{k,l=1}^{n} (A^*)_{kl} A_{lk} = \sum_{k,l=1}^{n} |A_{lk}|^2 \ge 0, \]

and the sum is zero only if each entry of A is zero (i.e., only if A is zero). This
shows that (B.4) defines an inner product. This inner product on the space of
matrices is called the Hilbert-Schmidt inner product. The norm associated
to the Hilbert-Schmidt inner product is the norm on matrices introduced in
Section 2.1.
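
To make (B.4) concrete, the following plain-Python sketch evaluates trace(A*B) for two complex 2 x 2 matrices chosen only for illustration, and checks conjugate-symmetry and the sum-of-squares formula for trace(A*A). The helper names `adjoint`, `matmul`, `trace`, and `hs_inner` are ours, not notation from the text.

```python
# Hilbert-Schmidt inner product <A, B> = trace(A* B) on 2 x 2 complex matrices.

def adjoint(X):
    """Conjugate transpose of a square matrix given as a list of rows."""
    n = len(X)
    return [[X[j][i].conjugate() for j in range(n)] for i in range(n)]

def matmul(X, Y):
    """Product of two square matrices."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def trace(X):
    """Sum of the diagonal entries."""
    return sum(X[k][k] for k in range(len(X)))

def hs_inner(X, Y):
    """trace(X* Y), conjugate-linear in X and linear in Y."""
    return trace(matmul(adjoint(X), Y))

A = [[1 + 2j, 0], [3j, -1]]
B = [[2, 1j], [1, 1 - 1j]]

ip_AB = hs_inner(A, B)
ip_BA = hs_inner(B, A)
norm_sq = hs_inner(A, A)
# Sum of |entry|^2, computed from real and imaginary parts to stay exact.
entry_sum = sum(complex(a).real ** 2 + complex(a).imag ** 2
                for row in A for a in row)
```

The value `norm_sq` is the square of the Hilbert-Schmidt norm of A, and it agrees with the entrywise sum as the displayed computation predicts.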

For any inner product on a finite-dimensional vector space, we can define
the adjoint of a linear operator by the condition that $\langle u, Av \rangle = \langle A^* u, v \rangle$ for
all u and v in the space. (Compare (B.3).)

Suppose that V is a finite-dimensional vector space with inner product
and that W is a subspace of V. Then, the orthogonal complement of W,
denoted $W^\perp$, is the set of all vectors v in V such that $\langle w, v \rangle = 0$ for all w
in W. The basic results are (1) $(W^\perp)^\perp = W$ and (2) V decomposes as the
direct sum of W and $W^\perp$. The second point means that every vector v in V
can be decomposed uniquely as v = w + u, where $w \in W$ and $u \in W^\perp$. This,
in particular, means that $\dim W + \dim W^\perp = \dim V$.


B.7 Dual Spaces


A linear functional on a vector space V is a linear map of V into $\mathbb{C}$. If
$v_1, \ldots, v_n$ is a basis for V, then for each set of constants $a_1, \ldots, a_n$, there is a
unique linear functional $\phi$ such that $\phi(v_k) = a_k$. If V is a finite-dimensional
complex vector space, then the dual space to V, denoted V*, is the set of 
all linear functionals on V. This is also a vector space and its dimension is 
the same as that of V. Since V and V* have the same dimension, there is a 
temptation to think of them as being the same space. This temptation should 
be resisted—a failure to distinguish clearly between V and V* is a source of 
much needless confusion. In certain cases, we will want to identify V with V*, 
but I have tried to clearly indicate in all such cases that we are making such 
an identification and how it is made. (The identification is usually made by 
means of an inner product; see below.) 

If W is a subspace of a vector space V, then the annihilator subspace
of W, denoted $W^{\wedge}$, is the set of all $\phi$ in V* such that $\phi(w) = 0$ for all w in
W. Then, $W^{\wedge}$ is a subspace of V*. If V is finite dimensional, then $\dim W +
\dim W^{\wedge} = \dim V$ and the map $W \mapsto W^{\wedge}$ provides a one-to-one correspondence
between subspaces of V and subspaces of V*.

Suppose that V is a finite-dimensional vector space with an inner product.
Then, for each u in V, we can define a linear functional $\phi_u \in V^*$ by the formula

\[ \phi_u(v) = \langle u, v \rangle. \]

Recall that we take the inner product to be linear in the second spot, so that $\phi_u$
is indeed a linear functional on V. It can be shown that every linear functional
on V arises in this way for some unique vector u in V. (If $\phi$ is zero then u = 0.
If $\phi$ is nonzero then we choose u to be in the orthogonal complement of the
kernel of $\phi$, where this orthogonal complement is one dimensional, and adjust
the normalization of u until $\phi_u = \phi$.) Thus, the map $u \mapsto \phi_u$ gives a one-to-one,
onto correspondence between V and V*. However, this correspondence is
not linear! This is because u goes into the conjugate-linear spot in the inner
product, as it must, since v needs to go into the linear spot in order for $\phi_u(v)$
to be a linear functional in v. Actually, the map $u \mapsto \phi_u$ is conjugate-linear:
$\phi_{\lambda u} = \bar{\lambda} \phi_u$.

So (in the finite-dimensional case), an inner product gives us a way to 
identify V and V*. However, this identification is not intrinsic; it depends on 
the choice of the inner product. Furthermore, the identification is conjugate- 
linear rather than linear. 
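
The conjugate-linearity can be verified in one line from the properties of the inner product. In the sketch below we write $\phi_u$ for the functional $v \mapsto \langle u, v \rangle$ (our labeling of the functional attached to u), and the last line specializes to $\mathbb{C}^n$ with the standard inner product as an illustrative assumption.

```latex
\begin{align*}
\phi_{\lambda u}(v) &= \langle \lambda u, v \rangle
                     = \bar{\lambda}\, \langle u, v \rangle
                     = \bar{\lambda}\, \phi_u(v), \\
\phi_{u_1 + u_2}(v) &= \langle u_1 + u_2, v \rangle
                     = \langle u_1, v \rangle + \langle u_2, v \rangle
                     = \phi_{u_1}(v) + \phi_{u_2}(v), \\
\text{and on } \mathbb{C}^n:\quad
\phi_u(v) &= \sum_{k=1}^{n} \overline{u_k}\, v_k ,
\end{align*}
```

so $\phi_u$ is represented by the row vector $(\overline{u_1}, \ldots, \overline{u_n})$, and the conjugation in these entries is the source of the conjugate-linearity in u.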


B.8 Simultaneous Diagonalization 


Definition B.9. Suppose that V is a vector space and $\mathcal{A}$ is some collection
of linear operators on V. Then a simultaneous eigenvector for $\mathcal{A}$ is a
nonzero vector $v \in V$ such that for all $A \in \mathcal{A}$, there exists a constant $\lambda_A$ with
$Av = \lambda_A v$. The numbers $\lambda_A$ are the simultaneous eigenvalues associated
to v.


Consider, for example, the space D of all diagonal n x n matrices. Then, for
each k = 1, ..., n, the standard basis element $e_k$ is a simultaneous eigenvector
for D. For each diagonal matrix A, the simultaneous eigenvalue associated to
$e_k$ is the kth diagonal entry of A.


Proposition B.10. If $\mathcal{A}$ is a commuting family of linear operators on a finite-
dimensional complex vector space, then $\mathcal{A}$ has at least one simultaneous
eigenvector.


It is essential here that the elements of $\mathcal{A}$ commute; noncommuting families
of operators typically have no simultaneous eigenvectors.

In most cases, the collection $\mathcal{A}$ of operators on V is a subspace of End(V),
the space of all linear operators from V to itself. In that case, if v is a
simultaneous eigenvector for $\mathcal{A}$ then the eigenvalues $\lambda_A$ for v depend linearly on
A. (After all, if $A_1 v = \lambda_1 v$ and $A_2 v = \lambda_2 v$, then $(A_1 + cA_2)v = (\lambda_1 + c\lambda_2)v$.)
This leads to the following definition.


Definition B.11. Suppose that V is a vector space and $\mathcal{A}$ is a vector space
of linear operators on V. Define a weight for $\mathcal{A}$ to be a linear functional $\mu$ on
$\mathcal{A}$ such that there exists a nonzero vector $v \in V$ with

\[ Av = \mu(A) v \]

for all A in $\mathcal{A}$. For a given weight $\mu$, the set of all vectors $v \in V$ satisfying
$Av = \mu(A)v$ for all A in $\mathcal{A}$ is called the weight space associated to the weight
$\mu$.


That is to say, a weight is a set of simultaneous eigenvalues for the operators
in $\mathcal{A}$. If V is finite dimensional and the elements of $\mathcal{A}$ all commute with
one another, then there will exist at least one weight for $\mathcal{A}$.

If $\mathcal{A}$ is finite dimensional and comes equipped with an inner product, then
it is often convenient to use the inner product to identify $\mathcal{A}$ and $\mathcal{A}^*$ in the
definition of a weight. From this point of view, we define a weight to be an
element $\mu$ of $\mathcal{A}$ (not $\mathcal{A}^*$) such that there exists a nonzero v in V with

\[ Av = \langle \mu, A \rangle\, v \]

for all $A \in \mathcal{A}$.


Definition B.12. Suppose that V is a finite-dimensional vector space and $\mathcal{A}$
is some collection of linear operators on V. Then the elements of $\mathcal{A}$ are said
to be simultaneously diagonalizable if there exists a basis $v_1, \ldots, v_n$ for V
such that each $v_k$ is a simultaneous eigenvector for $\mathcal{A}$.

If $\mathcal{A}$ is a vector space of linear operators on V, then saying that the elements
of $\mathcal{A}$ are simultaneously diagonalizable is equivalent to saying that V
can be decomposed as a direct sum of weight spaces of $\mathcal{A}$.

If a collection $\mathcal{A}$ of operators is simultaneously diagonalizable, then the
elements of $\mathcal{A}$ must commute, since they commute when applied to each $v_k$.
Conversely, if each $A \in \mathcal{A}$ is diagonalizable by itself and if the elements of
$\mathcal{A}$ commute, then (it can be shown) the elements of $\mathcal{A}$ are simultaneously
diagonalizable. We record these results in the following proposition.


Proposition B.13. If $\mathcal{A}$ is a simultaneously diagonalizable family of linear
operators on a finite-dimensional vector space V, then the elements of
$\mathcal{A}$ commute. If $\mathcal{A}$ is a commuting collection of linear operators on a finite-
dimensional vector space V and each $A \in \mathcal{A}$ is diagonalizable, then the elements
of $\mathcal{A}$ are simultaneously diagonalizable.
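
A small example may help fix the ideas. The sketch below, in plain Python, takes two commuting symmetric 2 x 2 matrices of our own choosing (they are not from the text) and checks that the vectors (1, 1) and (1, -1) are simultaneous eigenvectors, and that the simultaneous eigenvalues depend linearly on the operator. The helper `eigenvalue_on` is a hypothetical name introduced for this illustration.

```python
from fractions import Fraction

def matmul(X, Y):
    """Product of two square matrices given as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def matvec(X, v):
    """Apply the matrix X to the column vector v."""
    return [sum(X[i][k] * v[k] for k in range(len(v))) for i in range(len(X))]

def eigenvalue_on(X, v):
    """Return lam if X v = lam v for the nonzero vector v, else None."""
    w = matvec(X, v)
    i = next(k for k, entry in enumerate(v) if entry != 0)
    lam = Fraction(w[i], v[i])
    if all(w[k] == lam * v[k] for k in range(len(v))):
        return lam
    return None

A = [[2, 1], [1, 2]]
B = [[0, 1], [1, 0]]
basis = [[1, 1], [1, -1]]   # candidate simultaneous eigenvectors

commute = matmul(A, B) == matmul(B, A)
eigs_A = [eigenvalue_on(A, v) for v in basis]
eigs_B = [eigenvalue_on(B, v) for v in basis]

# Linearity of the simultaneous eigenvalues: test the operator A + 2B.
A_plus_2B = [[A[i][j] + 2 * B[i][j] for j in range(2)] for i in range(2)]
eigs_sum = [eigenvalue_on(A_plus_2B, v) for v in basis]
```

On the span of A and B, each basis vector therefore determines a weight in the sense of Definition B.11: the functional sending aA + bB to 3a + b for (1, 1), and to a - b for (1, -1).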


We close this appendix with an analog of Proposition B.1 for simultaneous 
eigenvectors. 


Proposition B.14. Suppose V is a vector space and $\mathcal{A}$ is a vector space
of linear operators on V. Suppose $\mu_1, \ldots, \mu_m$ are distinct weights for $\mathcal{A}$ and
$v_1, \ldots, v_m$ are elements of the corresponding weight spaces. If $v_1 + \cdots + v_m = 0$,
then $v_k = 0$ for all k = 1, ..., m.


C 


More on Lie Groups 


In this appendix, I briefly summarize (without proofs) the notion of a dif- 
ferentiable manifold and the notion of a general (not necessarily matrix) Lie 
group. I then explain briefly the standard approach to the Lie algebra and 
exponential mapping for general Lie groups. This means that the Lie algebra 
is the space of left-invariant vector fields and the exponential mapping is de- 
fined in terms of the flow along such vector fields. Although this approach is 
not used in the rest of the book, I cover it in order to help the reader make 
contact with the approach used in other books. Anyone who is going to delve 
deeply into the theory of Lie groups needs to learn this approach eventually. 
For more information on the manifold approach to Lie groups, see standard 
references such as Warner (1983) or Varadarajan (1974). 

I will also give two examples of Lie groups that are not matrix Lie groups. 
The first is the group described in Section 1.8, which may also be described as 
a quotient of the Heisenberg group by a discrete subgroup of its center. The 
second is the universal cover of SL(n; R). 


C.1 Manifolds 
C.1.1 Definition 


A topological manifold M of dimension n is a topological space (assumed
second-countable and Hausdorff) that is locally homeomorphic to $\mathbb{R}^n$. This
means that for each point m in M, there is a neighborhood U of m and a
one-to-one, continuous map $\phi$ of U onto some open set $\phi(U)$ in $\mathbb{R}^n$
such that the inverse map $\phi^{-1} : \phi(U) \to U$ is also continuous. We may say
that a manifold is a topological space that looks locally like a little piece of $\mathbb{R}^n$.
We think of the map $\phi$ as defining local coordinate functions $x_1, \ldots, x_n$, where
each $x_k$ is the continuous function from U into $\mathbb{R}$ given by $x_k(m) = \phi(m)_k$ (the
kth component of $\phi(m)$). If $\psi$ is another homeomorphism of another
neighborhood V of m, and $y_k(m) = \psi(m)_k$ is the associated coordinate system,
then both coordinate systems are defined in the neighborhood $U \cap V$ of m.
We can then think of the y's as functions of the x's; more precisely, we may
consider the map $\psi \circ \phi^{-1}$ that maps the set $\phi(U \cap V)$ onto the set $\psi(U \cap V)$.
This is the “change of coordinates” map; that is, for all m in $U \cap V$, we have

\[ (y_1(m), \ldots, y_n(m)) = (\psi \circ \phi^{-1})(x_1(m), \ldots, x_n(m)). \]

This change of coordinates map is continuous (since both $\psi$ and $\phi^{-1}$ are
continuous).

A smooth manifold of dimension n is a topological manifold M together
with a distinguished family of local coordinate systems $(U_\alpha, \phi_\alpha)$ with the
following properties. (Here, $\alpha$ ranges over some indexing set.) First, every
point in M is contained in at least one of the $U_\alpha$'s. Second, for any two of these
coordinate systems $(U_\alpha, \phi_\alpha)$ and $(U_\beta, \phi_\beta)$, the change-of-coordinates map
$\phi_\beta \circ \phi_\alpha^{-1}$ is a smooth map of the set $\phi_\alpha(U_\alpha \cap U_\beta) \subset \mathbb{R}^n$ onto the set
$\phi_\beta(U_\alpha \cap U_\beta) \subset \mathbb{R}^n$. In more concrete terms, this means that to make a smooth manifold,
we start with a topological manifold and then choose a collection of local 
coordinate systems that cover the whole manifold and such that whenever 
two coordinate systems are defined in overlapping regions, the expression for 
one set of coordinates in terms of the other is always smooth. Note that we 
must choose these coordinate systems in order to give a smooth structure to 
the topological manifold M. For some topological manifolds, it is impossible 
to make such a choice: Some manifolds do not admit a smooth structure. 
When a smooth structure exists, it is not unique. (In some cases it is unique 
“up to diffeomorphism,” but it is never actually unique.) 

Once a smooth structure is chosen, we define a smooth local coordinate
system to be any local coordinate system $(U, \phi)$ (not necessarily one of the
$U_\alpha$'s) such that $\phi \circ \phi_\alpha^{-1}$ is smooth for each $(U_\alpha, \phi_\alpha)$. A function $f : M \to \mathbb{R}$ is
called smooth if for each smooth local coordinate system $(U, \phi)$, the function
$f \circ \phi^{-1}$ is a smooth function on the set $\phi(U)$. Another way to say this is that f
is smooth if it is smooth in each smooth local coordinate system. If $(U, \phi)$ is a
smooth local coordinate system and $x_1, \ldots, x_n$ are the associated coordinate
functions $x_k(m) = \phi(m)_k$, then it is common to write $f(x_1, \ldots, x_n)$ to mean
$f(\phi^{-1}(x_1, \ldots, x_n))$. In that case,

\[ \frac{\partial f}{\partial x_k}(m) \]

means more pedantically the value of the kth partial derivative of $f \circ \phi^{-1}$
evaluated at the point $\phi(m) = (x_1(m), \ldots, x_n(m))$.


C.1.2 Tangent space 


One way to construct a manifold is as a submanifold of some Euclidean space
$\mathbb{R}^n$. We may think, for example, of a smooth surface S inside $\mathbb{R}^3$. In that
case, the tangent space at a point $m \in S$ is the set of vectors v in $\mathbb{R}^3$ that
can be expressed as $v = d\gamma/dt|_{t=0}$, where $\gamma(t)$ is a smooth curve lying in S
and satisfying $\gamma(0) = m$. For each point m in S, the tangent space at m is a
two-dimensional subspace of $\mathbb{R}^3$.

This description is “extrinsic”; that is, it depends on having S embedded
inside $\mathbb{R}^3$. We want an “intrinsic” description of the tangent space, one that
does not depend on having our manifold embedded inside some Euclidean
space. We look, then, for some aspect of the tangent space that can be
described without reference to the embedding. One possibility is to think about
the directional derivative in the direction of a tangent vector v. If f is a smooth
function defined on S, we define the directional derivative of f at the point
m and in the direction of the vector v to be

\[ (D_v f)(m) = \left. \frac{d}{dt} f(\gamma(t)) \right|_{t=0}, \]

where $\gamma$ is any smooth curve lying in S with $\gamma(0) = m$ and $d\gamma/dt|_{t=0} = v$.
(The value of the directional derivative is independent of the choice of $\gamma$.) Note
that the directional derivative associates a number to each smooth function
f. Also, derivatives of this sort satisfy the usual product rule for derivatives.

For a general manifold, not necessarily embedded in $\mathbb{R}^n$, we define the
notion of tangent space by abstracting the notion of the directional derivative.
The tangent space at m to M, denoted $T_m(M)$, is the set of all linear maps
X from $C^\infty(M)$ into $\mathbb{R}$ satisfying (1) the “product rule”:

\[ X(fg) = X(f)\, g(m) + f(m)\, X(g) \]

for all f and g in $C^\infty(M)$; (2) “localization”: If f is equal to g in a
neighborhood of m, then X(f) = X(g). This is easily seen to be a real vector space.
An element of $T_m(M)$ is called a tangent vector at m. If $x_1, \ldots, x_n$ is a
local coordinate system, then one can prove that each tangent vector X at m
can be expressed uniquely as

\[ X(f) = \sum_{k=1}^{n} a_k \frac{\partial f}{\partial x_k}(m) \tag{C.1} \]

for some real constants $a_1, \ldots, a_n$. This means that if M is a manifold of
dimension n, then for each m in M, $T_m(M)$ is a real vector space of dimension
n.


C.1.3 Differentials of smooth mappings 


A map $\Phi$ from a manifold M of dimension $n_1$ to a manifold N of dimension
$n_2$ is called smooth if it is smooth in local coordinates; that is, $\Phi$ is smooth
if, for every coordinate system $\phi_\alpha$ on M and every coordinate system $\phi_\beta$ on
N, $\phi_\beta \circ \Phi \circ \phi_\alpha^{-1}$ is a smooth map from an open subset of $\mathbb{R}^{n_1}$ into $\mathbb{R}^{n_2}$. Given a
smooth map, one can define the differential (or derivative) of $\Phi$ at each point
m in M, denoted $\Phi_{*,m}$. The differential $\Phi_{*,m}$ is the linear map of $T_m(M)$ into
$T_{\Phi(m)}(N)$ given by

\[ \Phi_{*,m}(X)(f) = X(f \circ \Phi), \]

where X is a tangent vector at m to M and f is a smooth (real-valued) function
on N. It is straightforward to check that $\Phi_{*,m}(X)$ is, indeed, a tangent
vector to N at $\Phi(m)$ (i.e., that it satisfies the Leibniz rule) and that the map
$\Phi_{*,m}$ is linear. In local coordinates, $\Phi$ will look like a map from $\mathbb{R}^{n_1}$ to $\mathbb{R}^{n_2}$
and $\Phi_{*,m}$ will then be essentially the matrix of partial derivatives of $\Phi$. The
differential of $\Phi$ is sometimes written as $d\Phi$.

If $\Phi : M \to N$ and $\Psi : N \to P$ are smooth maps, then $\Psi \circ \Phi : M \to P$ is
also smooth. The chain rule in this setting takes the form

\[ (\Psi \circ \Phi)_{*,m} = \Psi_{*,\Phi(m)} \circ \Phi_{*,m}. \tag{C.2} \]


Suppose that $\gamma : (a, b) \to M$ is a smooth curve. Then for each $t \in (a, b)$
we will let $d\gamma/dt$ denote the element of $T_{\gamma(t)}(M)$ with the property that

\[ \frac{d\gamma}{dt}(f) = \frac{d f(\gamma(t))}{dt} \tag{C.3} \]

for all smooth functions f on M. (Recall that we are thinking of tangent
vectors as things that operate on functions.) In a smooth local coordinate
system $x_1, \ldots, x_n$, we can find smooth functions $x_1(\cdot), \ldots, x_n(\cdot)$ of one variable
such that $\gamma(t)$ is the point whose coordinates are $x_1(t), \ldots, x_n(t)$. (Actually,
$x_k(t)$ is nothing but $x_k(\gamma(t))$. Here we make a typical abuse of notation by
allowing $x_k$ to denote both the coordinate function on M and the associated
function on $\mathbb{R}$ obtained by evaluating $x_k$ on $\gamma$.) In that case, the chain rule
tells us that $df(\gamma(t))/dt = \sum_k (\partial f/\partial x_k)(dx_k/dt)$. Thus, (C.3) becomes

\[ \frac{d\gamma}{dt} = \sum_{k=1}^{n} \frac{dx_k}{dt} \frac{\partial}{\partial x_k}. \tag{C.4} \]


C.1.4 Vector fields 


A vector field is a map X that associates to each point m in M a tangent
vector $X_m \in T_m(M)$. Given a local coordinate system $x_1, \ldots, x_n$, a vector
field can be expressed (in the domain of definition of that coordinate system)
as

\[ X = \sum_{k=1}^{n} a_k \frac{\partial}{\partial x_k}, \tag{C.5} \]

where the $a_k$'s are real-valued functions. (Here we are simply using the
representation (C.1) at each point.) A vector field is called smooth if the coefficient
functions $a_k$ are smooth in each local coordinate system. We can apply a vector
field to a smooth function f by applying $X_m$ to f at each point m. The
result X(f) is then another function, which will be smooth if X is a smooth
vector field. So, a smooth vector field is a map from $C^\infty(M)$ to $C^\infty(M)$ that
satisfies the product rule in the form

\[ X(fg) = f X(g) + X(f) g. \tag{C.6} \]


Note, here, that X(fg) is a function, not a number, and that on the right- 
hand side, we do not evaluate f or g at any point. Equation (C.6) can be 
restated as saying that a vector field is a derivation of the algebra of smooth 
functions. 

One should maintain a geometric picture of a vector field as a collection 
of arrows, one at each point in the manifold. Nevertheless, one can also think 
of the vector field as a differential operator (mapping the space of smooth 
functions to itself), the one obtained by differentiating a function at each 
point in the direction of the tangent vector at that point. 

Looking at (C.5) we see that a vector field can be regarded as a first- 
order differential operator. If we multiply (i.e., compose) two vector fields, 
we will get a second-order differential operator; this is not a vector field. 
However, if X and Y are vector fields and we compute their commutator 
XY - YX, then the second-order terms in XY will cancel with the second-order
terms in YX and the result will again be a first-order differential operator
(i.e., a vector field). Alternatively, one can check that if X and Y satisfy the
product rule (C.6), then so does XY - YX. The space of smooth vector fields
then becomes an infinite-dimensional Lie algebra with the bracket defined by 
[X,Y] = XY — YX. (This bracket satisfies the Jacobi identity because the 
composition of differential operators is associative.) 
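
The cancellation of second-order terms can be seen explicitly in local coordinates. In the sketch below, X has coefficient functions $a_k$ as in (C.5), and we assume Y is written the same way with coefficient functions $b_k$ (the symbol $b_k$ is our notation, introduced only for this computation).

```latex
\begin{align*}
XY(f) &= \sum_{j,k} a_j \frac{\partial b_k}{\partial x_j} \frac{\partial f}{\partial x_k}
       + \sum_{j,k} a_j b_k \frac{\partial^2 f}{\partial x_j \partial x_k}, \\
YX(f) &= \sum_{j,k} b_j \frac{\partial a_k}{\partial x_j} \frac{\partial f}{\partial x_k}
       + \sum_{j,k} b_j a_k \frac{\partial^2 f}{\partial x_j \partial x_k}, \\
[X,Y](f) &= \sum_{k} \Bigl( \sum_{j} a_j \frac{\partial b_k}{\partial x_j}
          - b_j \frac{\partial a_k}{\partial x_j} \Bigr) \frac{\partial f}{\partial x_k}.
\end{align*}
```

The two second-order sums agree after exchanging the labels j and k, because mixed partial derivatives of a smooth function commute. What remains has the form (C.5), with coefficient functions $\sum_j (a_j\, \partial b_k/\partial x_j - b_j\, \partial a_k/\partial x_j)$.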


C.1.5 The flow along a vector field 


If X is a vector field and $\gamma : (a, b) \to M$ is a smooth curve in M, then $\gamma$ is
called an integral curve for X if for each $t \in (a, b)$, we have $d\gamma/dt = X_{\gamma(t)}$.
In a smooth local coordinate system $x_1, \ldots, x_n$, $\gamma(t)$ will be represented by a
family of functions $x_1(t), \ldots, x_n(t)$ and the vector field X will be represented
in the form (C.5) with each $a_k$ being a smooth function of $x_1, \ldots, x_n$. In light
of (C.4), the equation $d\gamma/dt = X_{\gamma(t)}$ becomes, in local coordinates,

\[ \frac{dx_k(t)}{dt} = a_k(x_1(t), \ldots, x_n(t)). \]


This is a system of first-order ordinary differential equations (not necessarily 
linear). Applying standard results giving uniqueness and local existence for 
solutions of such systems, we obtain the following results. 


Theorem C.1 (Local Existence). Given a smooth vector field X and a
point $m \in M$, there exist $\varepsilon > 0$ and a smooth curve $\gamma : (-\varepsilon, \varepsilon) \to M$ such
that $\gamma(0) = m$ and such that $\gamma$ is an integral curve for X.


Theorem C.2 (Uniqueness). Suppose that X is a smooth vector field and
that $\gamma_1 : (-a_1, b_1) \to M$ and $\gamma_2 : (-a_2, b_2) \to M$ are two integral curves
for X satisfying $\gamma_1(0) = \gamma_2(0) = m$. Then, $\gamma_1(t) = \gamma_2(t)$ for all t between
$-\min(a_1, a_2)$ and $\min(b_1, b_2)$; that is, the two curves agree on the interval
where both of them are defined.


Note that existence is merely local: In general, it may be impossible to find
an integral curve $\gamma(t)$ through a point m that is defined for all $t \in \mathbb{R}$. In the
case $M = \mathbb{R}$, this amounts to asserting that first-order ordinary differential
equations may not have solutions defined for all time. (Consider, for example,
the separable equation $dy/dt = y^2$, the solutions of which are $y(t) = (c - t)^{-1}$.)
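
The finite-time blow-up in this example can be seen numerically. The following plain-Python sketch takes c = 1 (an arbitrary choice for illustration), checks by a central difference that y(t) = 1/(c - t) satisfies dy/dt = y^2 at a few sample times, and evaluates the solution at points approaching t = c. The function names are ours.

```python
# Solution y(t) = 1 / (c - t) of dy/dt = y^2, which ceases to exist at t = c.

def y(t, c=1.0):
    """Candidate solution; defined only for t < c on the branch through y(0) = 1/c."""
    return 1.0 / (c - t)

def central_difference(f, t, h=1e-6):
    """Symmetric finite-difference approximation to f'(t)."""
    return (f(t + h) - f(t - h)) / (2.0 * h)

samples = [0.0, 0.5, 0.9]
# Residual of the ODE, |y'(t) - y(t)^2|, at each sample time.
residuals = [abs(central_difference(y, t) - y(t) ** 2) for t in samples]

# Values of the solution at t = 1 - 10^(-k): they grow without bound.
near_blowup = [y(1.0 - 10.0 ** (-k)) for k in (1, 3, 6)]
```

With initial value y(0) = 1/c > 0 the integral curve exists only on the interval t < c, so the vector field $y^2\, d/dy$ on $\mathbb{R}$ has integral curves that cannot be extended to all times.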

A vector field X is called complete if $\gamma(t)$ can be defined for all t and all
initial points m. Any vector field on a compact manifold is always complete.
On a noncompact manifold, some vector fields will be complete and some will
not be. If X is a complete vector field, then one can define the associated
flow on M. This is a family of maps $\Phi_t : M \to M$ defined so that if $\gamma$ is
an integral curve for X with $\gamma(0) = m$, then $\Phi_t(m) = \gamma(t)$. This means that
$\Phi_t(m)$ is defined by starting at m and “flowing” along the vector field X for
time t. (If X is not complete, one can still define a sort of flow, but then each
$\Phi_t$ is defined only on part of M.) If X is a smooth complete vector field, then
each $\Phi_t$ is a smooth map of M to itself, and the maps satisfy $\Phi_s \circ \Phi_t = \Phi_{s+t}$.
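
As an illustration of a flow, consider the vector field $X = x\, d/dx$ on $M = \mathbb{R}$. This is an example of our own choosing rather than one from the text; its integral curves are $\gamma(t) = e^t m$, so it is complete. The sketch below checks the composition law and the integral-curve equation numerically.

```python
import math

def vector_field(x):
    """Coefficient a(x) of the vector field X = a(x) d/dx; here a(x) = x."""
    return x

def flow(t, x):
    """Flow of X = x d/dx for time t, starting at x: the point e^t * x."""
    return math.exp(t) * x

s, t, m = 0.3, -1.2, 2.0

# Composition law: flowing for time t and then for time s equals flowing for s + t.
lhs = flow(s, flow(t, m))
rhs = flow(s + t, m)

# Integral-curve equation: d/dt flow(t, m) should equal X at the point flow(t, m).
h = 1e-6
velocity = (flow(t + h, m) - flow(t - h, m)) / (2.0 * h)
field_value = vector_field(flow(t, m))
```

Negative times are allowed, and flowing for time -t undoes flowing for time t, which is the special case s = -t of the composition law.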


C.1.6 Submanifolds of vector spaces 


If V is a finite-dimensional real vector space, then we may make V into a 
smooth manifold by using a single, globally defined linear coordinate system. 
Given vectors u and v in V, we can define the directional derivative of a
function f at the point u in the direction of the vector v as

\[ (D_v f)(u) = \left. \frac{d}{dt} f(u + tv) \right|_{t=0}. \]

For each v, the directional derivative $D_v$ satisfies the Leibniz rule. Thus, each
vector v gives rise to an element of $T_u(V)$. It is not hard to show that every
tangent vector at u can be expressed in this form. Thus, we have a natural
way to identify $T_u(V)$ with V itself, for each $u \in V$.

Suppose V is a real vector space of dimension n. A subset M of V is called
a smooth embedded submanifold of dimension k if given any $m_0$ in M,
there exists a smooth coordinate system $(\phi, U)$ defined in a neighborhood U
of $m_0$ such that for any $m \in U$, m is in $U \cap M$ if and only if $\phi(m)$ is in
$\mathbb{R}^k \subset \mathbb{R}^n$. (Here, we think of $\mathbb{R}^k$ as the subset of $\mathbb{R}^n$ where the last n - k
coordinates are zero.) This says that locally, in a suitable coordinate system,
M looks like $\mathbb{R}^k$ sitting inside $\mathbb{R}^n$. If M is a smooth embedded submanifold
of dimension k, then we can make M into a smooth manifold of dimension k
as follows. We use as our basic coordinate neighborhoods the sets of the form
$U \cap M$. We use as our coordinate map on such a set the restriction of $\phi$ to
$U \cap M$. (This restriction maps $U \cap M$ to $\mathbb{R}^k \subset \mathbb{R}^n$.)

If M is a smooth embedded submanifold of V, then the inclusion map i of
M into V is a smooth map. (Here, i is defined by i(m) = m for $m \in M$.) The
differential $i_* : T_m(M) \to T_m(V)$ is injective, and it is customary to identify
$T_m(M)$ with its image in $T_m(V)$, which is a k-dimensional subspace of the
n-dimensional space $T_m(V)$. To say this more explicitly, if X is a tangent
vector to M at m (viewed as a map of $C^\infty(M)$ to $\mathbb{R}$), then we can make X
into a tangent vector $\tilde{X}$ at m to V by defining

\[ \tilde{X}(f) = X(f|_M). \]


This allows us to think of the tangent space to M at m as a subspace of the 
tangent space to V at m. However, we are identifying the tangent space at m 
to V with V itself. Thus, the tangent space to M at m is identified with a 
subspace of V. (This subspace depends on the point m.) 

It is not hard to show that if M is a smooth embedded submanifold of V,
then for each m, $T_m(M)$ (viewed as a subspace of V) is the usual geometric
tangent space, as follows.


Proposition C.3. Let M be a smooth embedded submanifold of a finite-
dimensional real vector space V. Then, for each $m \in M$, the tangent space to
M at m (regarded as a subspace of V as above) is the set of all u in V such
that there exists a smooth curve $\gamma$ in M with $\gamma(0) = m$ and $d\gamma/dt|_{t=0} = u$.


C.1.7 Complex manifolds 


A complex manifold is a smooth manifold of dimension 2n such that the basic
coordinate patches $(U_\alpha, \phi_\alpha)$ have the property that the change-of-coordinates
map $\phi_\beta \circ \phi_\alpha^{-1}$ is holomorphic for each $\alpha$ and $\beta$. Here, $\mathbb{R}^{2n}$ is identified with $\mathbb{C}^n$
and holomorphic means the same as complex analytic. Then, any other local
coordinate system $\phi$ (not necessarily one of the $\phi_\alpha$'s) is said to be holomorphic
if the change-of-coordinates map between $\phi$ and each of the $\phi_\alpha$'s is holomorphic.
A map between two complex manifolds is said to be holomorphic if it is
holomorphic in each holomorphic local coordinate system. If V is a complex
vector space, then a subset M of V is called an embedded complex
submanifold of dimension k if, given any $m_0$ in M, there exists a holomorphic
local coordinate system $(\phi, U)$ defined in a neighborhood U of $m_0$ such that
for any $m \in U$, m is in $U \cap M$ if and only if $\phi(m)$ is in $\mathbb{C}^k \subset \mathbb{C}^n$.


C.2 Lie Groups 


C.2.1 Definition 


A Lie group is a smooth manifold that is also a group. More precisely, we have 
the following. 

Definition C.4. A Lie group is a smooth manifold G together with a smooth 
map $G \times G \to G$ that makes G into a group and such that the inverse 
map $g \mapsto g^{-1}$ is a smooth map of G to itself. 


The simplest example is $G = \mathbb{R}^n$, with the product map given by 
$(x, y) \mapsto x + y$. A more interesting example was given in Section 1.8. See also Section 
C.3. It is shown in Chapter 2 that every matrix Lie group is a Lie group. (See 
also Subsection C.2.6.) 


C.2.2 The Lie algebra 


If G is a Lie group and g an element of G, we define a map $L_g : G \to G$ by 
$L_g(h) = gh$. This is the “left multiplication by g” map, which is smooth since 
the product map of $G \times G$ to G is assumed smooth. Then, the differential 
$(L_g)_*$ of $L_g$ at a point h will be a linear map of $T_h(G)$ to $T_{gh}(G)$. A vector 
field X on G is called left-invariant if X satisfies 


\[ (L_g)_*(X_h) = X_{gh}. \]


Let $T_e(G)$ denote the tangent space at the identity. Then, given any vector 
$v \in T_e(G)$, there is a unique left-invariant vector field $X^v$ with $X^v_e = v$, which 
can be constructed by defining 


\[ X^v_g = (L_g)_{*,e}(v). \]


To show that the vector field constructed in this way is left-invariant, one needs 
to note that $L_g \circ L_h = L_{gh}$, from which it follows (by the chain rule (C.2)) 
that $(L_{gh})_{*,e} = (L_g)_{*,h}(L_h)_{*,e}$. It should be evident that every left-invariant 
vector field arises in this way (with v equal to the value of the left-invariant 
vector field at the identity). The set of all left-invariant vector fields is a real 
vector space whose dimension is the same as that of G, and it is isomorphic 
as a vector space to $T_e(G)$ by means of evaluation at the identity. 

Recall that if we think of vector fields as first-order differential operators, 
then the commutator of two vector fields is, again, a vector field. It is not 
difficult to show that the commutator of two left-invariant vector fields is, 
again, a left-invariant vector field. 


Definition C.5. The Lie algebra $\mathfrak{g}$ of a Lie group G is the tangent space at 
the identity with the bracket operation defined by 


\[ [v, w] = [X^v, X^w]_e. \]


If we identify the space of left-invariant vector fields with $T_e(G)$ by means 
of the map $v \mapsto X^v$, then $\mathfrak{g}$ is just the space of left-invariant vector fields, 
which forms a Lie algebra under the commutator of vector fields. 


C.2.3 The exponential mapping 


The exponential mapping for a general Lie group is defined in terms of the flow 
along left-invariant vector fields. This definition is justified by the following 
result. 


Proposition C.6. If G is a Lie group, then every left-invariant vector field 
on G is complete. 


Definition C.7. Let G be a Lie group and let $\mathfrak{g} = T_e(G)$ be its Lie algebra. 
For each $v \in \mathfrak{g}$, let $X^v$ be the associated left-invariant vector field and let $\Phi^v$ be 
the associated flow. Then, the exponential mapping is the map $\exp : \mathfrak{g} \to G$ 
defined by 


\[ \exp(v) = \Phi^v_1(e). \]


This means that to compute $\exp(v)$, we first construct the left-invariant 
vector field $X^v$ and we then find an integral curve $\gamma^v$ to $X^v$ that starts at 
the identity. Then, $\exp(v) = \gamma^v(1)$. To say this yet again, the exponential 
mapping is the time-one flow along a left-invariant vector field starting at the 
identity. 

It can be shown that the exponential mapping is a smooth map of $\mathfrak{g}$ into G 
and that the differential of exp at the origin is the identity map of $\mathfrak{g}$ to itself. (Here, 
we identify both $T_0(\mathfrak{g})$ and $T_e(G)$ with $\mathfrak{g}$.) It then follows from the inverse 
function theorem that the exponential mapping takes all sufficiently small 
neighborhoods of the origin in $\mathfrak{g}$ diffeomorphically onto neighborhoods of the 
identity in G. The properties of the exponential mapping that we have proved 
for matrix Lie groups continue to hold for general Lie groups. For example, 
$\exp(v + w) = \exp(v)\exp(w)$ whenever $[v, w] = 0$, the Lie product formula holds, 
and the Baker–Campbell–Hausdorff formula holds. 
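
As a numerical illustration in the matrix case, the following Python sketch checks two of the properties just listed: exp(v + w) = exp(v) exp(w) for a commuting pair, and the Lie product formula for a noncommuting pair. The helper names (matmul, expm, lie_product, and so on) and the specific 2 x 2 matrices are choices made for this illustration only, and the exponential is computed by a truncated power series, which is adequate only for small matrices of modest norm.

```python
def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def scale(c, a):
    return [[c * x for x in row] for row in a]

def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

def expm(x, terms=40):
    """Matrix exponential by a truncated power series (illustration only)."""
    result = identity(len(x))
    term = identity(len(x))
    for k in range(1, terms):
        term = scale(1.0 / k, matmul(term, x))   # term = x^k / k!
        result = add(result, term)
    return result

def max_diff(a, b):
    return max(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))

def lie_product(x, y, m):
    """(exp(x/m) exp(y/m))^m, the m-th approximant in the Lie product formula."""
    step = matmul(expm(scale(1.0 / m, x)), expm(scale(1.0 / m, y)))
    result = identity(len(x))
    for _ in range(m):
        result = matmul(result, step)
    return result

# A commuting pair: v is diagonal and w is a multiple of the identity.
v = [[1.0, 0.0], [0.0, -1.0]]
w = [[0.5, 0.0], [0.0, 0.5]]
commuting_error = max_diff(expm(add(v, w)), matmul(expm(v), expm(w)))

# A noncommuting pair: [x, y] is nonzero, so exp(x) exp(y) differs from exp(x + y).
x = [[0.0, 1.0], [0.0, 0.0]]
y = [[0.0, 0.0], [1.0, 0.0]]
target = expm(add(x, y))
naive_error = max_diff(target, matmul(expm(x), expm(y)))
error_10 = max_diff(target, lie_product(x, y, 10))
error_1000 = max_diff(target, lie_product(x, y, 1000))
```

In this sketch, commuting_error is at the level of rounding error, naive_error is visibly nonzero, and the Lie product approximants get closer to exp(x + y) as m grows.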


C.2.4 Homomorphisms 


Suppose $\Phi : G \to H$ is a smooth map of a Lie group G to a Lie group H 
that is also a group homomorphism. Then, $\Phi$ is called a Lie group homomorphism. 
(In the matrix case, we originally required only that Lie group 
homomorphisms be continuous. However, we proved (Section 2.7) that every 
continuous homomorphism between matrix Lie groups is actually smooth.) 

If $\Phi : G \to H$ is a Lie group homomorphism, then the differential $\Phi_{*,g}$ 
maps $T_g(G)$ into $T_{\Phi(g)}(H)$. In particular, $\Phi_{*,e}$ is a linear map of $\mathfrak{g} = T_e(G)$ 
into $\mathfrak{h} = T_e(H)$. 


Proposition C.8. Let $\Phi : G \to H$ be a Lie group homomorphism and set 
$\phi = \Phi_{*,e}$, so that $\phi$ is a linear map of $\mathfrak{g}$ into $\mathfrak{h}$. Then, $\phi$ is a Lie algebra 
homomorphism and satisfies 


\[ \exp(\phi(v)) = \Phi(\exp v) \]


for all $v \in \mathfrak{g}$. 


Note that the way $\phi$ is defined is consistent with the way we defined things 
in the matrix case (Theorem 2.21). 

For each g in G, let $C_g : G \to G$ be the “conjugation by g” map; that is, 
$C_g(h) = ghg^{-1}$. For each g, $C_g$ is a Lie group homomorphism of G to itself. 
Thus, the differential of $C_g$ at the identity is a Lie algebra homomorphism of 
$\mathfrak{g}$ to itself. This map is denoted $\mathrm{Ad}_g$: 


\[ \mathrm{Ad}_g = (C_g)_{*,e}. \]


This can be computed in a more concrete way as 


\[ \mathrm{Ad}_g(X) = \left.\frac{d}{dt}\, g e^{tX} g^{-1}\right|_{t=0}. \]


Note, here, that $g e^{tX} g^{-1}$ is a smooth curve in G that passes through the 
identity at $t = 0$, and, thus, the derivative of this curve at $t = 0$ is an element 
of $\mathfrak{g} = T_e(G)$. 
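
In the matrix case, the derivative formula for the adjoint map can be checked numerically. The following Python sketch compares a central-difference approximation of the derivative of $g e^{tX} g^{-1}$ at $t = 0$ with the product $gXg^{-1}$, which is the matrix form of $\mathrm{Ad}_g(X)$. The helper functions, the particular matrices g and X, and the step size h are all choices made for this illustration, and the exponential is computed by a truncated power series.

```python
def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def expm(x, terms=30):
    """Matrix exponential by a truncated power series (illustration only)."""
    n = len(x)
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms):
        term = [[v / k for v in row] for row in matmul(term, x)]  # x^k / k!
        result = [[result[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    return result

def curve(g, g_inv, x, t):
    """The curve t -> g exp(tX) g^{-1}, which passes through I at t = 0."""
    tx = [[t * v for v in row] for row in x]
    return matmul(matmul(g, expm(tx)), g_inv)

g = [[2.0, 1.0], [1.0, 1.0]]        # determinant 1
g_inv = [[1.0, -1.0], [-1.0, 2.0]]  # inverse of g
X = [[0.0, 1.0], [-1.0, 0.0]]

h = 1e-5
plus, minus = curve(g, g_inv, X, h), curve(g, g_inv, X, -h)
derivative = [[(plus[i][j] - minus[i][j]) / (2 * h) for j in range(2)]
              for i in range(2)]
conjugate = matmul(matmul(g, X), g_inv)  # g X g^{-1}

error = max(abs(derivative[i][j] - conjugate[i][j])
            for i in range(2) for j in range(2))
```

The two matrices agree up to the accuracy of the finite-difference approximation, and the conjugate has trace zero, as it must since X does.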


C.2.5 Quotient groups and covering groups 


Given a connected smooth manifold M, there is a construction (which I will 
not attempt to describe here) that yields a simply-connected manifold $\tilde{M}$ 
together with a map $\Phi : \tilde{M} \to M$ with the following property: Each $m \in M$ 
has a neighborhood U such that $\Phi^{-1}(U)$ is the disjoint union of open sets 
$V_\alpha$, each of which is mapped by $\Phi$ diffeomorphically onto U. Such a pair 
$(\tilde{M}, \Phi)$ is called a universal cover of M and is “unique up to canonical 
diffeomorphism.” If $M = G$ is a connected Lie group, then the universal 
cover $\tilde{G}$ can be given a group structure in a canonical way and the map $\Phi$ 
in this case is a Lie group homomorphism of $\tilde{G}$ onto G. The associated Lie 
algebra homomorphism $\phi : \tilde{\mathfrak{g}} \to \mathfrak{g}$ is an isomorphism; therefore, we often say 
that $\tilde{G}$ and G have “the same” Lie algebra. 

This construction illustrates the advantage of working in the general Lie 
group setting: Every Lie group has a universal cover that is, again, a Lie group 
and that can be constructed in a canonical way. By contrast, the universal 
cover $\tilde{G}$ of a matrix Lie group G may not be a matrix Lie group, and even 
if it is, there is no canonical procedure for finding a matrix representation of 
$\tilde{G}$. For example, the universal cover of $\mathrm{SL}(n;\mathbb{R})$, $n \geq 2$, is not a matrix Lie 
group, as shown in the next section. 

Meanwhile, suppose that G is a Lie group and that N is a closed normal 
subgroup of G. Then, there is a unique manifold structure on the quotient 
group G/N that makes G/N into a Lie group and such that the quotient map 
$G \to G/N$ is then a Lie group homomorphism. That this procedure can be 
carried out for any Lie group G and any closed normal subgroup N again 
illustrates the advantage of working with general Lie groups. By contrast, if 
G is a matrix Lie group, G/N may not be, as the first example in the next 
section shows. Furthermore, even if G/N happens to be a matrix Lie group, 
there is no canonical procedure for finding a matrix representation of it. 


C.2.6 Matrix Lie groups as Lie groups 


We have proved that every matrix Lie group is a smooth embedded submanifold 
of the vector space $V = M_n(\mathbb{C})$. (Here we think of $M_n(\mathbb{C})$ as a real vector 
space of dimension $2n^2$. A matrix Lie group is in general only a real embedded 
submanifold of $M_n(\mathbb{C})$ and not necessarily a complex embedded submanifold.) 
Since the matrix product and matrix inverse are smooth on the open subset 
$\mathrm{GL}(n;\mathbb{C})$ of $M_n(\mathbb{C})$, this shows that every matrix Lie group is a Lie group. 
(The restrictions of smooth mappings to smooth embedded submanifolds are 
smooth.) 

Meanwhile, the Lie algebra $\mathfrak{g}$ of a matrix Lie group G (as we have defined 
$\mathfrak{g}$ in Chapter 2) is just the tangent space to G at the identity. To see this, note 
that every X in $\mathfrak{g}$ is the derivative of a smooth curve through the identity, 
namely the curve $\gamma(t) = e^{tX}$. Conversely, using the local logarithm, one can 
show that if X is the derivative of any smooth curve in G passing through I 
at $t = 0$, then X is in $\mathfrak{g}$. (See Corollary 2.35 in Section 2.7.) 

It remains to show that the exponential map as we have defined it in the 
matrix case agrees with the exponential map as we have defined it for general 
Lie groups. If $X \in \mathfrak{g}$, what we need to show is that the curve 


\[ \gamma(t) = e^{tX} \]


is an integral curve for the left-invariant vector field whose value at the identity 
is X. This means that we must show that 


\[ \frac{d}{dt} e^{tX} = (L_{e^{tX}})_*(X). \]


To do this, we note that 


\[
\frac{d}{dt} e^{tX}
  = \left.\frac{d}{da} e^{(t+a)X}\right|_{a=0}
  = \left.\frac{d}{da} e^{tX} e^{aX}\right|_{a=0}
  = (L_{e^{tX}})_*\left(\left.\frac{d}{da} e^{aX}\right|_{a=0}\right)
  = (L_{e^{tX}})_*(X).
\]


The second-to-last equality is essentially the chain rule. 

We conclude, then, that every matrix Lie group is a Lie group and that the 
way the Lie algebra and the exponential mapping are defined in the matrix 
case is consistent with the way the Lie algebra and the exponential mapping 
are defined for general Lie groups. 


C.2.7 Complex Lie groups 


A complex Lie group is a complex manifold endowed with a group structure 
in such a way that the product and inverse maps are holomorphic. 

Suppose that G is a Lie group of dimension 2n with Lie algebra $\mathfrak{g}$. If 
there exists a real-linear map $J : \mathfrak{g} \to \mathfrak{g}$ such that (1) $J^2 = -I$ and (2) 
$[JX, Y] = J[X, Y]$ for all X and Y in $\mathfrak{g}$, then we say that J is a complex 
structure on the Lie algebra $\mathfrak{g}$. In that case, we can regard $\mathfrak{g}$ as a complex Lie 
algebra by defining the “multiplication by i” map to be J. Condition 1 tells 
us that J makes $\mathfrak{g}$ into a complex vector space. Condition 2 then implies that 
the bracket is complex-bilinear. 

If G is a complex Lie group, then there is a natural way of defining a map 
J on $\mathfrak{g}$ so that $\mathfrak{g}$ becomes a complex Lie algebra and so that the exponential 
mapping is holomorphic from $\mathfrak{g}$ into G. Conversely, suppose G is an even- 
dimensional Lie group and we can find a map $J : \mathfrak{g} \to \mathfrak{g}$ that makes $\mathfrak{g}$ into a 
complex Lie algebra. Then G can be given the structure of a complex manifold in 
such a way that G becomes a complex Lie group and the exponential mapping 
is holomorphic. (There may be many different possible complex structures on 
G that make G into a complex Lie group. In that case, different complex 
structures on G will correspond to different complex structures J on $\mathfrak{g}$.) To 
oversimplify slightly, we may say that G is a complex Lie group if and only if 
$\mathfrak{g}$ is a complex Lie algebra. See Varadarajan (1974) for more information. 

If $G \subset \mathrm{GL}(n;\mathbb{C})$ is a matrix Lie group and its Lie algebra $\mathfrak{g}$ is a complex 
subalgebra of $\mathfrak{gl}(n;\mathbb{C})$, then $\mathfrak{g}$ has a complex structure, namely the usual 
“multiplication by i” map. In that case, it can be shown that G is an embedded 
complex submanifold of $\mathrm{GL}(n;\mathbb{C})$ and, thus, a complex Lie group. This shows 
that the definition of a complex matrix Lie group given in Chapter 2 (Defi- 
nition 2.20) is sensible: A complex matrix Lie group is indeed a complex Lie 
group. 


C.3 Examples of Nonmatrix Lie Groups 


We explained in the previous section that every matrix Lie group is (as the 
name suggests) a Lie group. In this section, we will show that the converse is 
not true: Not every Lie group is isomorphic to a matrix Lie group. 

Our first example is the Lie group G introduced in Section 1.8, namely 
$G = \mathbb{R} \times \mathbb{R} \times S^1$, with the group product defined by 


\[ (x_1, y_1, u_1) \cdot (x_2, y_2, u_2) = (x_1 + x_2,\; y_1 + y_2,\; e^{i x_1 y_2} u_1 u_2). \]


Meanwhile, let H be the Heisenberg group (i.e., the group of $3 \times 3$ real matrices 
that are upper triangular with ones on the diagonal). Consider the 
map $\Phi : H \to G$ given by 


\[ \Phi \begin{pmatrix} 1 & a & b \\ 0 & 1 & c \\ 0 & 0 & 1 \end{pmatrix} = (a, c, e^{ib}). \]


Direct computation shows that $\Phi$ is a homomorphism. The kernel of $\Phi$ is the 
discrete normal subgroup N of H given by 


\[ N = \left\{ \begin{pmatrix} 1 & 0 & 2\pi n \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix} \;\middle|\; n \in \mathbb{Z} \right\}. \]
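
The "direct computation" can be spot-checked numerically. The Python sketch below encodes a Heisenberg matrix with upper-right entries a, b, c as a triple (a, b, c) and an element of G as a triple (x, y, u) with u a complex number of modulus one; this encoding, the function names, and the sample points are assumptions made only for this illustration. A numerical check at a few points is of course not a proof.

```python
import cmath
import math

def heis_mul(h1, h2):
    """Product in the Heisenberg group, with (a, b, c) standing for the
    matrix [[1, a, b], [0, 1, c], [0, 0, 1]]."""
    a1, b1, c1 = h1
    a2, b2, c2 = h2
    return (a1 + a2, b1 + b2 + a1 * c2, c1 + c2)

def g_mul(p, q):
    """Product in G = R x R x S^1 with the twisted third component."""
    x1, y1, u1 = p
    x2, y2, u2 = q
    return (x1 + x2, y1 + y2, cmath.exp(1j * x1 * y2) * u1 * u2)

def phi(h):
    """The map Phi: H -> G sending (a, b, c) to (a, c, e^{ib})."""
    a, b, c = h
    return (a, c, cmath.exp(1j * b))

def close(p, q, tol=1e-12):
    return all(abs(s - t) < tol for s, t in zip(p, q))

h1 = (0.7, -1.3, 2.1)
h2 = (-0.4, 0.9, 1.5)

# Homomorphism property: Phi(h1 h2) = Phi(h1) Phi(h2).
is_homomorphism = close(phi(heis_mul(h1, h2)), g_mul(phi(h1), phi(h2)))

# Elements of N, where a = c = 0 and b = 2*pi*n, map to the identity (0, 0, 1).
kernel_ok = all(close(phi((0.0, 2 * math.pi * n, 0.0)), (0.0, 0.0, 1.0))
                for n in range(-3, 4))
```

The cross term a1 * c2 in the Heisenberg product is exactly what produces the twisting factor in the third component of the product on G.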


Now, suppose that $\Psi$ is any finite-dimensional representation of G (i.e., a 
continuous homomorphism of G into some $\mathrm{GL}(n;\mathbb{C})$). Then, we can define an associated 
representation $\Sigma$ of H by $\Sigma = \Psi \circ \Phi$. Clearly, the kernel of any such representation 
of H must include the kernel N of $\Phi$. Now, let Z(H) denote the center 
of H, which is easily shown to be 


\[ Z(H) = \left\{ \begin{pmatrix} 1 & 0 & b \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix} \;\middle|\; b \in \mathbb{R} \right\}. \]


Theorem C.9. Let $\Sigma$ be any finite-dimensional representation of H. If 
$\ker \Sigma \supset N$, then $\ker \Sigma \supset Z(H)$. 


Once this is established, we will be able to conclude that there are no 
faithful finite-dimensional representations of G. After all, if $\Psi$ is any finite- 
dimensional representation of G, then the kernel of $\Sigma = \Psi \circ \Phi$ will contain N 
and, thus, Z(H), by the theorem. Thus, for all $b \in \mathbb{R}$, 


\[ \Sigma \begin{pmatrix} 1 & 0 & b \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix} = \Psi(0, 0, e^{ib}) = I. \]


This means that the kernel of $\Psi$ contains all elements of the form (0, 0, u) and 
$\Psi$ is not faithful. So, we obtain the following result. 


Corollary C.10. The Lie group G has no faithful finite-dimensional repre- 
sentations. In particular, G is not isomorphic to any matrix Lie group. 


We now proceed with the proof of Theorem C.9. 


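Proof. Let $\sigma$ be the associated representation of the Lie algebra $\mathfrak{h}$ of H. Let 
A, B, and C be the basis elements for $\mathfrak{h}$ given by 


\[
A = \begin{pmatrix} 0 & 1 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}, \quad
B = \begin{pmatrix} 0 & 0 & 1 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}, \quad
C = \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 0 & 0 \end{pmatrix}.
\]


These satisfy the commutation relations $[A, C] = B$ and $[A, B] = [C, B] = 0$. 
Thus, $[\sigma(A), \sigma(C)] = \sigma(B)$ and $[\sigma(A), \sigma(B)] = [\sigma(C), \sigma(B)] = 0$. 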

I now claim that $\sigma(B)$ must be nilpotent. In light of the SN decomposition, 
this is equivalent to showing that all of the eigenvalues of $\sigma(B)$ are zero. So, 
let $\lambda$ be an eigenvalue for $\sigma(B)$ and let $V_\lambda$ be the associated eigenspace. 
Certainly, $V_\lambda$ is invariant under $\sigma(B)$ since $\sigma(B) = \lambda I$ on $V_\lambda$. Furthermore, 
since $\sigma(A)$ and $\sigma(C)$ commute with $\sigma(B)$, they must also leave $V_\lambda$ invariant 
(Proposition B.4). The restrictions of these operators to $V_\lambda$ must still satisfy 
the same commutation relations as they do on the whole space. This means 
that the restriction of $\sigma(B)$ to $V_\lambda$ is the commutator of two operators on $V_\lambda$ 
(i.e., operators that map $V_\lambda$ to itself), namely the restrictions of $\sigma(A)$ and 
$\sigma(C)$ to $V_\lambda$. Since $\sigma(B)|_{V_\lambda}$ is a commutator, its trace must be zero (since 
the trace of UV is the same as the trace of VU for any operators U and V). 
On the other hand, $\sigma(B)|_{V_\lambda} = \lambda I$. So, $0 = \mathrm{trace}(\sigma(B)|_{V_\lambda}) = \lambda \dim V_\lambda$. If $\lambda$ 
is actually an eigenvalue, then $\dim V_\lambda \neq 0$ and, thus, we must have $\lambda = 0$. 
Since $\lambda$ was an arbitrary eigenvalue of $\sigma(B)$, we conclude that 0 is the only 
eigenvalue of $\sigma(B)$ and, thus, $\sigma(B)$ is nilpotent. 


Lemma C.11. If X is a nonzero nilpotent matrix, then for all nonzero real 
numbers t, $e^{tX} \neq I$. 


Proof. Since X is nilpotent, the power series for $e^{tX}$ terminates after a finite 
number of terms. Thus, each entry of $e^{tX}$ depends polynomially on t; that is, 
there exist polynomials $p_{kl}(t)$ such that $(e^{tX})_{kl} = p_{kl}(t)$. Now, suppose that 
there is some nonzero $t_0$ such that $e^{t_0 X} = I$. Then, $e^{n t_0 X} = I^n = I$ for all 
integers n. In terms of the polynomials $p_{kl}$, this means that $p_{kl}(n t_0) = \delta_{kl}$ 
for all n. However, a polynomial that takes on a certain value infinitely many 
times must be constant. Thus, as soon as there is one nonzero $t_0$ for which 
$e^{t_0 X} = I$, we must have $e^{tX} = I$ for all t (assuming, still, that X is nilpotent!). 
This, however, would then imply that $X = \frac{d}{dt} e^{tX}\big|_{t=0} = 0$. So, if X is 
nonzero and nilpotent, $e^{tX}$ must be different from the identity for all nonzero 
t. □ 


Now, we note that the $3 \times 3$ matrix B satisfies $B^2 = 0$, and, so, 


\[ e^{tB} = \begin{pmatrix} 1 & 0 & t \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}. \]

Suppose now that $\Sigma$ is a representation of H and that $\ker \Sigma \supset N$. Then, 


\[ e^{2\pi n \sigma(B)} = \Sigma(e^{2\pi n B}) = I \]


for all integers n. Since $\sigma(B)$ is nilpotent, the only way this can happen 
(according to the lemma) is if $\sigma(B)$ is zero. Thus, for all real numbers t, 


\[ \Sigma(e^{tB}) = e^{t\sigma(B)} = I \]


and $\ker \Sigma \supset Z(H)$. □ 


We now turn to our second example of a nonmatrix Lie group. We make 
use of the following topological result: For all $n \geq 2$, $\mathrm{SL}(n;\mathbb{R})$ is not simply 
connected and $\mathrm{SL}(n;\mathbb{C})$ is simply connected. This result was recorded in the 
table in Chapter 1; the method of proof is described in Appendix E. We also 
make use of the concept of the universal cover of a Lie group, as described 
in the previous section. The universal cover of $\mathrm{SL}(n;\mathbb{R})$, denoted $\widetilde{\mathrm{SL}}(n;\mathbb{R})$, 
is a simply-connected Lie group together with a Lie group homomorphism 
$\Phi : \widetilde{\mathrm{SL}}(n;\mathbb{R}) \to \mathrm{SL}(n;\mathbb{R})$ such that the associated Lie algebra homomorphism 
$\phi : \widetilde{\mathfrak{sl}}(n;\mathbb{R}) \to \mathfrak{sl}(n;\mathbb{R})$ is an isomorphism. It is customary to permanently 
identify $\widetilde{\mathfrak{sl}}(n;\mathbb{R})$ with $\mathfrak{sl}(n;\mathbb{R})$ by means of the map $\phi$ and say (in a slight 
abuse of terminology) that the Lie algebra of $\widetilde{\mathrm{SL}}(n;\mathbb{R})$ is $\mathfrak{sl}(n;\mathbb{R})$. 


Theorem C.12. There does not exist any faithful finite-dimensional representation 
of $\widetilde{\mathrm{SL}}(n;\mathbb{R})$. Thus, $\widetilde{\mathrm{SL}}(n;\mathbb{R})$ is a Lie group that is not isomorphic to 
any matrix Lie group. 


Proof. We will show that if $\Pi$ is any finite-dimensional representation of 
$\widetilde{\mathrm{SL}}(n;\mathbb{R})$, then the kernel of $\Pi$ contains the kernel of the homomorphism 
$\Phi : \widetilde{\mathrm{SL}}(n;\mathbb{R}) \to \mathrm{SL}(n;\mathbb{R})$. Since $\mathrm{SL}(n;\mathbb{R})$ is not simply connected, $\Phi$ has a 
nontrivial kernel and, thus, $\Pi$ is not faithful (i.e., has a nontrivial kernel). 

So, now let $\Pi$ be a finite-dimensional representation of $\widetilde{\mathrm{SL}}(n;\mathbb{R})$, acting on 
a finite-dimensional complex vector space V. We then have an associated Lie 
algebra representation $\pi$ of $\mathfrak{sl}(n;\mathbb{R})$, which is (isomorphic to) the Lie algebra 
of $\widetilde{\mathrm{SL}}(n;\mathbb{R})$. We may extend $\pi$ by complex-linearity to a representation of 
$\mathfrak{sl}(n;\mathbb{C})$, also denoted $\pi$. Then, since $\mathrm{SL}(n;\mathbb{C})$ is simply connected, we may 
exponentiate $\pi$ to a representation $\Pi'$ of $\mathrm{SL}(n;\mathbb{C})$. Finally, we may restrict $\Pi'$ 
to $\mathrm{SL}(n;\mathbb{R})$. 

Now, construct a new representation $\Sigma$ of $\widetilde{\mathrm{SL}}(n;\mathbb{R})$ by setting $\Sigma = \Pi' \circ \Phi$. 
I claim that $\Sigma$ coincides with the original representation $\Pi$ of $\widetilde{\mathrm{SL}}(n;\mathbb{R})$. 
To see this, consider the associated Lie algebra representation $\sigma = \pi' \circ \phi$. 
Now, we are regarding $\phi$ as the identity map of $\mathfrak{sl}(n;\mathbb{R})$ to itself. Meanwhile, 
by construction, the Lie algebra map $\pi'$ associated to $\Pi'$ is $\pi$. So, in fact, 
$\sigma = \pi$. Since $\widetilde{\mathrm{SL}}(n;\mathbb{R})$ is connected, this implies that the associated group 
homomorphisms $\Sigma$ and $\Pi$ are equal. (We have proved such a result for matrix 
Lie groups; the same result with much the same proof holds for all Lie groups.) 

So, every representation $\Pi$ of $\widetilde{\mathrm{SL}}(n;\mathbb{R})$ is of the form $\Pi = \Pi' \circ \Phi$ for some 
representation $\Pi'$ of $\mathrm{SL}(n;\mathbb{R})$. Thus, $\ker \Pi \supset \ker \Phi \neq \{e\}$ and $\Pi$ is not faithful. 
□ 


These two examples illustrate the limitations of matrix Lie groups with 
respect to the operations of quotients and universal covers. The group G is 
isomorphic to the quotient group H/N, and although H is a matrix Lie group, 
H/N is not. Similarly, SL(n; R) is a matrix Lie group, but its universal cover 
is not. 

Recall from Section 4.10 the notion of Haar measure. A left Haar measure 
on a Lie group G is a nonzero measure that is locally finite and left-invariant. 
The simplest way to prove the existence of such a measure is to use differential 
forms. 

Differential forms are an important aspect of the theory of differentiable 
manifolds. I have no space here to describe this notion in any detail, but the 
idea is roughly this. A k-form $\eta$ on an n-dimensional manifold M is an object 
that can be expressed in local coordinates as 


\[ \eta(m) = \sum a_{i_1 i_2 \cdots i_k}(m)\, dx_{i_1} \wedge dx_{i_2} \wedge \cdots \wedge dx_{i_k}. \]


Here, $\wedge$ is the “wedge product,” which is defined in such a way that the 
expression $dx_{i_1} \wedge dx_{i_2} \wedge \cdots \wedge dx_{i_k}$ changes sign with the interchange of any 
two factors. In coordinate-independent language, a k-form is something that 
takes values at each point m in the kth exterior power of the cotangent space 
at m, where the cotangent space is defined as the dual space to the tangent 
space at m. The significance of the concept of k-forms is that there is a 
natural (coordinate-independent) way of integrating a k-form over (oriented) 
k-dimensional submanifolds of M. In particular, if M is oriented and $\eta$ is an 
n-form, then it makes sense to integrate $\eta$ over M itself. 

Given an n-dimensional oriented manifold M and a nowhere-vanishing oriented 
n-form $\eta$, we can make a measure $\mu$ on M by defining the integral of f 
against $\mu$ to be the integral of the n-form $f\eta$. It is not hard to show that on an 
n-dimensional Lie group G, there exists a nowhere-vanishing n-form that is 
invariant under left translations and that this form is unique up to a constant. 
Integrating functions against this form (with the correct orientation) gives a 
left-invariant measure (i.e., a left Haar measure). 

Suppose that $\mu$ is a left Haar measure on G and g is an element of G. 
Suppose we define a new measure $R_g(\mu)$ by setting $R_g(\mu)(E) = \mu(R_g E)$, 
where $R_g$ denotes right-translation by g. Then, it is not hard to see that 
$R_g(\mu)$ is again left-invariant (because left-translations commute with right- 
translations) and again given by integration against a left-invariant n-form. 
However, the left-invariant n-form describing $R_g(\mu)$ may not be equal to the 
left-invariant n-form describing $\mu$; it may be a constant multiple of that n-form. 
So, given any $g \in G$, there is a constant $\chi(g)$ such that $R_g(\mu) = \chi(g)\mu$, 
where $\mu$ is a left Haar measure. The function $\chi$ is called the modular 
function of G. A group G is called unimodular if the modular function is 
identically equal to one. If G is unimodular, then $R_g(\mu) = \mu$ (i.e., the left 
Haar measure is also right-invariant). 

Using the description of Haar measure in terms of left-invariant n-forms, 
one can show the following result. 


Proposition C.13. If G is a connected Lie group, then G is unimodular if 
and only if 


\[ \det(\mathrm{Ad}_g) = 1 \]


for all $g \in G$ or, equivalently, if and only if 


\[ \mathrm{trace}(\mathrm{ad}_X) = 0 \]


for all $X \in \mathfrak{g}$. 


Here, $\mathrm{Ad}_g$ and $\mathrm{ad}_X$ are viewed as linear maps of the Lie algebra $\mathfrak{g}$ to itself. 
The reason for this result is as follows. A right-translation can be expressed as 
a combination of a left-translation and the adjoint action: $R_g(h) = \mathrm{Ad}_g L_g(h)$. 
So, G is unimodular if and only if the left Haar measure is invariant under the 
adjoint action. Since everything in sight is left-invariant, it suffices to check 
the invariance of a left-invariant n-form under $\mathrm{Ad}_g$ at the identity. So, the 
relevant question is how $\mathrm{Ad}_g$ acts on the nth exterior power of the cotangent 
space at the identity, which is by the determinant of $\mathrm{Ad}_g$. 
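
The equivalence of the determinant condition and the trace condition in Proposition C.13 can be sketched as follows. The sketch assumes that the identities $\mathrm{Ad}_{e^X} = e^{\mathrm{ad}_X}$ and $\det(e^A) = e^{\mathrm{trace}(A)}$, both familiar from the matrix case, carry over to general Lie groups.

```latex
\[
\det\left(\mathrm{Ad}_{e^{tX}}\right)
  = \det\left(e^{t\,\mathrm{ad}_X}\right)
  = e^{t\,\mathrm{trace}(\mathrm{ad}_X)},
  \qquad X \in \mathfrak{g},\ t \in \mathbb{R}.
\]
```

If $\mathrm{trace}(\mathrm{ad}_X) = 0$ for all X, then $\det(\mathrm{Ad}_g) = 1$ for every g of the form $e^X$ and, hence, for every product of such elements, since $g \mapsto \det(\mathrm{Ad}_g)$ is a homomorphism. Because a connected Lie group is generated by the image of the exponential mapping, this covers all of G. Conversely, if $\det(\mathrm{Ad}_g) = 1$ for all g, then $e^{t\,\mathrm{trace}(\mathrm{ad}_X)} = 1$ for all t, and differentiating at $t = 0$ gives $\mathrm{trace}(\mathrm{ad}_X) = 0$.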

This proposition implies that compact Lie groups are unimodular, since, 
then, there exists an inner product with respect to which $\mathrm{ad}_X$ is skew- 
symmetric, and the trace of a real skew-symmetric matrix must be zero. More generally, 
it can be shown that all semisimple groups are unimodular. Meanwhile, the 
Heisenberg group is also unimodular because in this case $\mathrm{ad}_X$ is nilpotent 
for all X in the Lie algebra. The simplest example of a group that is not 
unimodular is the following two-dimensional matrix group: 


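\[ G = \left\{ \begin{pmatrix} a & b \\ 0 & 1 \end{pmatrix} \;\middle|\; a \in (0, \infty),\ b \in \mathbb{R} \right\}. \]


The reader is invited to verify that there exist elements X of the Lie algebra 
$\mathfrak{g}$ of G such that $\mathrm{trace}(\mathrm{ad}_X) \neq 0$. 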
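
The verification can be sketched in a few lines of Python, under the assumption that the group in question consists of the 2 x 2 matrices with first row (a, b), a > 0, and second row (0, 1). Under that assumption the Lie algebra is spanned by the two matrices X1 and X2 below; the basis, the function names, and the coordinate convention are all choices made for this illustration.

```python
def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def bracket(a, b):
    """Commutator [a, b] = ab - ba."""
    ab, ba = matmul(a, b), matmul(b, a)
    return [[ab[i][j] - ba[i][j] for j in range(len(a))] for i in range(len(a))]

# Assumed basis of the Lie algebra of the group of matrices [[a, b], [0, 1]]
# with a > 0: tangent vectors at the identity have zero second row.
X1 = [[1, 0], [0, 0]]
X2 = [[0, 1], [0, 0]]
BASIS = [X1, X2]

def coordinates(m):
    """Coordinates (c1, c2) of m = c1*X1 + c2*X2 in the basis (X1, X2)."""
    return [m[0][0], m[0][1]]

def ad_matrix(x):
    """Matrix of ad_x = [x, .] in the basis (X1, X2); column j is ad_x(X_j)."""
    columns = [coordinates(bracket(x, basis_element)) for basis_element in BASIS]
    return [[columns[j][i] for j in range(2)] for i in range(2)]

def trace(m):
    return sum(m[i][i] for i in range(len(m)))

trace_ad_X1 = trace(ad_matrix(X1))
trace_ad_X2 = trace(ad_matrix(X2))
```

Here $[X_1, X_2] = X_2$, so $\mathrm{ad}_{X_1}$ has trace 1, while $\mathrm{ad}_{X_2}$ is nilpotent and has trace 0.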


D 


Clebsch–Gordan Theory for SU(2) and the 
Wigner–Eckart Theorem 


D.1 Tensor Products of sl(2;C) Representations 


Recall from Section 4.6 the notion of the tensor product of representations of 
a group or Lie algebra. We consider this in the case of the irreducible representations 
of the group SU(2) or, equivalently, the irreducible complex-linear 
representations of $\mathfrak{sl}(2;\mathbb{C})$. These irreducible representations were classified 
in Section 4.4. For each non-negative integer m, we have an irreducible representation 
$(\pi_m, V_m)$ of $\mathfrak{sl}(2;\mathbb{C})$ of dimension $m + 1$, and every irreducible 
representation of $\mathfrak{sl}(2;\mathbb{C})$ is isomorphic to one of these. We regard the tensor 
product $V_m \otimes V_n$ as a representation of $\mathfrak{sl}(2;\mathbb{C})$. (Recall that it is also possible 
to view $V_m \otimes V_n$ as a representation of $\mathfrak{sl}(2;\mathbb{C}) \oplus \mathfrak{sl}(2;\mathbb{C})$.) The action of $\mathfrak{sl}(2;\mathbb{C})$ 
on $V_m \otimes V_n$ is given by 


\[ (\pi_m \otimes \pi_n)(X) = \pi_m(X) \otimes I + I \otimes \pi_n(X). \tag{D.1} \]


We use the standard basis {X, Y, H} for sl(2;C). 

By the averaging method of Section 4.10, we can find on each space $V_m$ an 
inner product that is invariant under the action of the compact group SU(2). 
(In the case of $V_1 \cong \mathbb{C}^2$, we can use the standard inner product on $\mathbb{C}^2$.) With 
respect to such an inner product, the orthogonal complement of a subspace 
invariant under SU(2) (or $\mathfrak{su}(2)$ or $\mathfrak{sl}(2;\mathbb{C})$) is again invariant under SU(2) (or 
$\mathfrak{su}(2)$ or $\mathfrak{sl}(2;\mathbb{C})$). Since the element H of $\mathfrak{sl}(2;\mathbb{C})$ is in $i\,\mathfrak{su}(2)$, $\pi_m(H)$ will be 
self-adjoint with respect to this inner product. This means that eigenvectors 
of $\pi_m(H)$ with distinct eigenvalues must be orthogonal. Once we have chosen 
SU(2)-invariant inner products on $V_m$ and $V_n$, there is a unique inner product 
on $V_m \otimes V_n$ with the property that $\langle u_1 \otimes v_1, u_2 \otimes v_2 \rangle = \langle u_1, u_2 \rangle \langle v_1, v_2 \rangle$. (This 
can be proved using the universal property of tensor products.) The inner 
product on $V_m \otimes V_n$ is also invariant under the action of SU(2). We assume 
in the rest of this section that an inner product of this sort has been chosen 
on each $V_m \otimes V_n$. 

In general, $V_m \otimes V_n$ will not be an irreducible representation of $\mathfrak{sl}(2;\mathbb{C})$; 
the goal of this section is to describe how $V_m \otimes V_n$ decomposes as a direct sum 
of irreducible invariant subspaces. This decomposition is referred to as the 
Clebsch–Gordan theory or (in the physics literature) as addition of angular 
momentum. Here, we use the mathematician’s labeling of the representations 
of $\mathfrak{sl}(2;\mathbb{C})$; physicists normally label the representations by the “spin” $l = m/2$, 
so that the possible values of l are $l = 0, \frac{1}{2}, 1, \frac{3}{2}, \ldots$. 

Let us consider first the case of $V_1 \otimes V_1$, where $V_1 = \mathbb{C}^2$, the standard 
representation of $\mathfrak{sl}(2;\mathbb{C})$. If $\{e_1, e_2\}$ is the standard basis for $\mathbb{C}^2$, then the 
vectors of the form $e_k \otimes e_l$, $1 \leq k, l \leq 2$, form a basis for $\mathbb{C}^2 \otimes \mathbb{C}^2$. Since $e_1$ and 
$e_2$ are eigenvectors for $\pi_1(H)$ with eigenvalues 1 and $-1$, respectively, then, by 
(D.1), the basis elements for $\mathbb{C}^2 \otimes \mathbb{C}^2$ are eigenvectors for the action of H with 
eigenvalues 2, 0, 0, and $-2$, respectively. Since 2 is the largest eigenvalue for 
H, the corresponding eigenvector $e_1 \otimes e_1$ must be annihilated by X (i.e., by 
the operator $\pi_1(X) \otimes I + I \otimes \pi_1(X)$). If we apply Y repeatedly to $e_1 \otimes e_1$, we 
obtain $e_1 \otimes e_2 + e_2 \otimes e_1$ and then $2 e_2 \otimes e_2$ and then zero. The space spanned 
by these vectors is invariant under $\mathfrak{sl}(2;\mathbb{C})$ and irreducible, isomorphic to the 
three-dimensional representation $V_2$. The orthogonal complement of this space 
in $\mathbb{C}^2 \otimes \mathbb{C}^2$, namely the span of $e_1 \otimes e_2 - e_2 \otimes e_1$, is also invariant, and $\mathfrak{sl}(2;\mathbb{C})$ 
acts trivially on this space. So, we have 


\[ \mathbb{C}^2 \otimes \mathbb{C}^2 = \mathrm{span}\{e_1 \otimes e_1,\ e_1 \otimes e_2 + e_2 \otimes e_1,\ e_2 \otimes e_2\} \oplus \mathrm{span}\{e_1 \otimes e_2 - e_2 \otimes e_1\}. \]


Thus the four-dimensional space $V_1 \otimes V_1$ is isomorphic, as an $\mathfrak{sl}(2;\mathbb{C})$ representation, 
to $V_2 \oplus V_0$. 
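
The computation for $\mathbb{C}^2 \otimes \mathbb{C}^2$ can be reproduced with explicit 4 x 4 matrices. The Python sketch below uses the standard matrices for H, X, and Y acting on $\mathbb{C}^2$ and orders the basis of the tensor product as $e_1 \otimes e_1$, $e_1 \otimes e_2$, $e_2 \otimes e_1$, $e_2 \otimes e_2$; the function names and this ordering are choices made for this illustration.

```python
def kron(a, b):
    """Kronecker product of two 2 x 2 matrices, in the basis
    e1(x)e1, e1(x)e2, e2(x)e1, e2(x)e2."""
    return [[a[i][j] * b[k][l] for j in range(2) for l in range(2)]
            for i in range(2) for k in range(2)]

def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def apply(m, v):
    """Apply the matrix m to the column vector v."""
    return [sum(m[i][j] * v[j] for j in range(len(v))) for i in range(len(m))]

I2 = [[1, 0], [0, 1]]
H = [[1, 0], [0, -1]]
X = [[0, 1], [0, 0]]
Y = [[0, 0], [1, 0]]

def tensor_action(z):
    """Action of z on C^2 (x) C^2 as in (D.1): z (x) I + I (x) z."""
    return add(kron(z, I2), kron(I2, z))

H2, X2, Y2 = tensor_action(H), tensor_action(X), tensor_action(Y)

h_eigenvalues = [H2[i][i] for i in range(4)]   # H2 is diagonal in this basis

# Apply Y repeatedly to e1 (x) e1 until the result is zero.
chain = []
v = [1, 0, 0, 0]
while any(v):
    chain.append(v)
    v = apply(Y2, v)

# The antisymmetric vector e1 (x) e2 - e2 (x) e1 is killed by H, X, and Y.
singlet = [0, 1, -1, 0]
singlet_images = [apply(m, singlet) for m in (H2, X2, Y2)]
```

The three vectors in chain span the copy of $V_2$, and the vanishing of singlet_images confirms that the antisymmetric vector spans a copy of the trivial representation $V_0$.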


Theorem D.1. For any non-negative integer k, let $V_k$ denote the irreducible 
representation of $\mathfrak{sl}(2;\mathbb{C})$ of dimension $k + 1$. For two non-negative integers 
m and n, consider $V_m \otimes V_n$ as a representation of $\mathfrak{sl}(2;\mathbb{C})$. Assume $m \geq n$. 
Then, 

\[ V_m \otimes V_n \cong V_{m+n} \oplus V_{m+n-2} \oplus \cdots \oplus V_{m-n+2} \oplus V_{m-n}, \]


where $\cong$ denotes an equivalence of $\mathfrak{sl}(2;\mathbb{C})$ representations. 


Note that this theorem is consistent with the special case worked out 
earlier: $V_1 \otimes V_1 \cong V_2 \oplus V_0$. Note that each irreducible representation 
occurring in the decomposition of $V_m \otimes V_n$ occurs only once. This is a special 
feature of the theory of $\mathfrak{sl}(2;\mathbb{C})$ representations and the analogous statement 
does not hold for tensor products of representations of other Lie algebras. 


Proof. Let us take a basis for each of the two spaces that is labeled by the 
eigenvalues for H. So, we have a basis $u_m, u_{m-2}, \ldots, u_{2-m}, u_{-m}$ for $V_m$ and 
$v_n, v_{n-2}, \ldots, v_{2-n}, v_{-n}$ for $V_n$. Then, the vectors of the form $u_k \otimes v_l$ form a 
basis for $V_m \otimes V_n$, and we compute that 


\[ [\pi_m(H) \otimes I + I \otimes \pi_n(H)]\, u_k \otimes v_l = (k + l)\, u_k \otimes v_l. \]


So, each of our basis elements is an eigenvector for the action of H in $V_m \otimes V_n$. 
Let us work out the eigenspaces for H in $V_m \otimes V_n$. The eigenspace with 
eigenvalue $m + n$ is one dimensional, spanned by $u_m \otimes v_n$. If $n > 0$, then the 
eigenspace with eigenvalue $m + n - 2$ has dimension 2, spanned by $u_{m-2} \otimes v_n$ 
and $u_m \otimes v_{n-2}$. Each time we decrease the eigenvalue of H by 2 we increase the 
dimension of the corresponding eigenspace by 1, until we reach the eigenvalue 
$m - n$, whose eigenspace is spanned by the vectors $u_{m-2n} \otimes v_n$, $u_{m-2n+2} \otimes v_{n-2}$, and so on 
up to $u_m \otimes v_{-n}$. This space has dimension $n + 1$. As we continue to decrease the 
eigenvalue of H in increments of 2, the dimensions remain constant until we 
reach eigenvalue $n - m$, at which point the dimensions begin decreasing by 1 
until we reach the eigenvalue $-m - n$, for which the corresponding eigenspace 
has dimension one, spanned by $u_{-m} \otimes v_{-n}$. This pattern is illustrated by the 
following table, which lists, for the case of $V_4 \otimes V_2$, each eigenvalue for H and 
a basis for the corresponding eigenspace. I leave it to the reader to verify that 
this pattern holds true in general. 


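Eigenvalue for H    Basis 

       6            $u_4 \otimes v_2$ 
       4            $u_2 \otimes v_2$, $u_4 \otimes v_0$ 
       2            $u_0 \otimes v_2$, $u_2 \otimes v_0$, $u_4 \otimes v_{-2}$ 
       0            $u_{-2} \otimes v_2$, $u_0 \otimes v_0$, $u_2 \otimes v_{-2}$ 
      -2            $u_{-4} \otimes v_2$, $u_{-2} \otimes v_0$, $u_0 \otimes v_{-2}$ 
      -4            $u_{-4} \otimes v_0$, $u_{-2} \otimes v_{-2}$ 
      -6            $u_{-4} \otimes v_{-2}$ 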
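
A table of this kind can be generated for any m and n. The following Python sketch lists, for each eigenvalue of H, the index pairs (k, l) of the basis vectors $u_k \otimes v_l$ with $k + l$ equal to that eigenvalue; the function name weight_table and the output format are choices made for this illustration.

```python
def weight_table(m, n):
    """Map each eigenvalue of H on V_m (x) V_n to the index pairs (k, l)
    of the basis vectors u_k (x) v_l having that eigenvalue k + l."""
    table = {}
    for k in range(m, -m - 1, -2):          # k = m, m-2, ..., -m
        for l in range(n, -n - 1, -2):      # l = n, n-2, ..., -n
            table.setdefault(k + l, []).append((k, l))
    # Order the eigenvalues from largest to smallest, as in the table.
    return {w: sorted(pairs) for w, pairs in sorted(table.items(), reverse=True)}

table = weight_table(4, 2)
dimensions = [len(pairs) for pairs in table.values()]
```

For m = 4 and n = 2, the dimensions of the eigenspaces rise from 1 to n + 1 = 3, stay constant, and then fall back to 1, which is the pattern described in the proof.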


Now, consider the vector $u_m \otimes v_n$, which is annihilated by X and is an 
eigenvector for H with eigenvalue $m + n$. Applying Y repeatedly gives a chain 
of eigenvectors for H with eigenvalues decreasing by 2 until they reach $-m - n$. 
According to Theorem 4.12, the span of these vectors is invariant under $\mathfrak{sl}(2;\mathbb{C})$ 
and irreducible, isomorphic to $V_{m+n}$. 

The orthogonal complement of the invariant subspace W obtained in the 
previous paragraph is also invariant. Since W contains each of the eigenvalues 
of H with multiplicity one, $W^\perp$ will have the multiplicity of each H-eigenvalue 
lowered by 1. So, $m + n$ is not an eigenvalue for H in $W^\perp$; the largest remaining 
eigenvalue is $m + n - 2$ and this has multiplicity one. So, if we start with an 
eigenvector for H in $W^\perp$ with eigenvalue $m + n - 2$, this will be annihilated by 
X and will generate an irreducible invariant subspace isomorphic to $V_{m+n-2}$. 

We now continue on in the same way, at each stage looking at the orthogo- 
nal complement of the sum of all the invariant subspaces we have obtained in 
the previous stages. Each step reduces the multiplicity of each H-eigenvalue by 
1 and thereby reduces the largest remaining H-eigenvalue by 2. This process 
will continue until there is nothing left, which will occur after $V_{m-n}$. 

In the case of $V_4 \otimes V_2$, we will get a seven-dimensional invariant subspace 
isomorphic to $V_6$, then a five-dimensional invariant subspace isomorphic to 
$V_4$, and then a three-dimensional invariant subspace isomorphic to $V_2$. Actually 
working out what these invariant subspaces are is complicated, but, in 
principle, possible. □ 
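
The eigenvalue bookkeeping in this proof can be sketched in Python. The code below counts the multiplicity of each H-eigenvalue on $V_m \otimes V_n$ and then repeatedly removes the eigenvalues of the irreducible representation generated by a vector with the largest remaining eigenvalue. It tracks multiplicities only and does not construct the invariant subspaces themselves; the function names are choices made for this illustration.

```python
from collections import Counter

def h_multiplicities(m, n):
    """Multiplicity of each H-eigenvalue k + l on V_m (x) V_n."""
    return Counter(k + l
                   for k in range(m, -m - 1, -2)
                   for l in range(n, -n - 1, -2))

def decompose(m, n):
    """Return the list of j such that V_m (x) V_n is the direct sum of the V_j,
    found by repeatedly removing the weights of the largest remaining V_j."""
    remaining = h_multiplicities(m, n)
    summands = []
    while sum(remaining.values()) > 0:
        top = max(w for w, count in remaining.items() if count > 0)
        summands.append(top)
        for w in range(top, -top - 1, -2):  # V_top has weights top, top-2, ..., -top
            remaining[w] -= 1
    return summands

result = decompose(4, 2)
```

For m = 4 and n = 2 the result lists the indices 6, 4, and 2, in agreement with the decomposition described at the end of the proof, and in every case the dimensions of the summands add up to (m + 1)(n + 1).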


D.2 The Wigner–Eckart Theorem 


Recall that the Lie algebras $\mathfrak{su}(2)$ and $\mathfrak{so}(3)$ are isomorphic. Specifically, we 
use the bases $\{E_1, E_2, E_3\}$ for $\mathfrak{su}(2)$ and $\{F_1, F_2, F_3\}$ for $\mathfrak{so}(3)$ described in 
Section 4.9. Then, the unique linear map $\phi : \mathfrak{su}(2) \to \mathfrak{so}(3)$ such that $\phi(E_k) = F_k$, 
$k = 1, 2, 3$, is a Lie algebra isomorphism, as a simple calculation will 
confirm. So, the representations of $\mathfrak{so}(3)$ are in one-to-one correspondence with 
the representations of $\mathfrak{su}(2)$, which, in turn, are in one-to-one correspondence 
with the complex-linear representations of $\mathfrak{sl}(2;\mathbb{C})$. So, the analysis of the 
decomposition of tensor products of $\mathfrak{sl}(2;\mathbb{C})$ representations in the previous 
section applies also to $\mathfrak{so}(3)$ representations. 

Now, suppose that $\pi$ is a so(3) representation acting on a finite-dimensional
vector space $V$. Let $\mathrm{End}(V)$ denote the space of endomorphisms of $V$
(i.e., the space of linear operators of $V$ into itself). Then, we can define an
associated representation $\tilde{\pi}$ of so(3) acting on $\mathrm{End}(V)$ by
the formula

$$\tilde{\pi}(X)(C) = [\pi(X), C], \quad X \in \mathrm{so}(3),\ C \in \mathrm{End}(V);$$

that is, $\tilde{\pi}$ is the map $X \mapsto \mathrm{ad}_{\pi(X)}$. Since the
maps $X \mapsto \pi(X)$ and $C \mapsto \mathrm{ad}_C$ are Lie algebra
homomorphisms, $\tilde{\pi}$ is also a homomorphism and, thus, a representation
of so(3) acting on $\mathrm{End}(V)$. (This is the standard way of “promoting”
the action of a Lie algebra on a vector space $V$ to an action on
$\mathrm{End}(V)$.)

Recall that the elements of so(3) are $3 \times 3$ real skew-symmetric matrices.
These matrices then act on $\mathbb{R}^3$; this is the standard representation
of so(3).


Definition D.2. Suppose that $\pi$ is a representation of so(3) acting on a
finite-dimensional space $V$ and $\tilde{\pi}$ is the associated representation
of so(3) acting on $\mathrm{End}(V)$. Then, a linear map
$A : \mathbb{R}^3 \to \mathrm{End}(V)$ is called a vector operator (acting on
$V$) if $A$ is an intertwining map of so(3) representations, that is, if

$$A(Xv) = [\pi(X), A(v)] \tag{D.2}$$

for all $X \in \mathrm{so}(3)$ and all $v \in \mathbb{R}^3$.


Let us try to understand more concretely what this means. Since $A$ is
assumed to be linear, it is determined by its values on the basis elements
$e_1, e_2, e_3$ for $\mathbb{R}^3$. So, let $A_k = A(e_k)$, $k = 1,2,3$. In the
physics literature, one thinks of $A$ as a “triple of operators”
$\mathbf{A} = (A_1, A_2, A_3)$, in which case the linear map
$A : \mathbb{R}^3 \to \mathrm{End}(V)$ is the same as

$$\mathbf{A} \cdot v = A_1 v_1 + A_2 v_2 + A_3 v_3.$$


Of course, not every triple of operators gives rise to a vector operator; the
map $A$ must satisfy (D.2). It suffices to check (D.2) for $X$ ranging over a
basis of so(3) and $v$ ranging over a basis of $\mathbb{R}^3$. So, using the
basis $\{F_1, F_2, F_3\}$ for so(3) and the standard basis $\{e_1, e_2, e_3\}$,
(D.2) is equivalent (given a linear map $A : \mathbb{R}^3 \to \mathrm{End}(V)$)
to the assertion that

$$A(F_k e_l) = [\pi(F_k), A(e_l)] \tag{D.3}$$

for $k = 1,2,3$ and $l = 1,2,3$.
Now, I claim that the $3 \times 3$ matrices $F_1$, $F_2$, and $F_3$ from
Section 4.9 satisfy

$$F_k e_l = \sum_{m=1}^{3} \varepsilon_{klm} e_m, \tag{D.4}$$

where $\varepsilon_{klm}$ is defined by

$$\varepsilon_{klm} = \begin{cases}
0 & \text{if any two of } k, l, m \text{ are equal} \\
1 & \text{if } (k,l,m) \text{ is a cyclic permutation of } (1,2,3) \\
-1 & \text{if } (k,l,m) \text{ is a non-cyclic permutation of } (1,2,3).
\end{cases}$$

Concretely, the last two conditions mean that $\varepsilon_{123}$,
$\varepsilon_{231}$, and $\varepsilon_{312}$ are equal to 1 and
$\varepsilon_{132}$, $\varepsilon_{213}$, and $\varepsilon_{321}$ are equal to
$-1$. Let us check (D.4), say, in the case $k = l = 1$. We note that
$\varepsilon_{11m} = 0$ for all $m$ and, so, (D.4) says that $F_1 e_1 = 0$,
which is true. Let us also check (D.4) in the case $k = 1$ and $l = 2$. We note
that $\varepsilon_{12m}$ is nonzero only when $m = 3$, in which case its value
is 1, so (D.4) says that $F_1 e_2 = e_3$, which is also true. I leave it to the
reader to check the correctness of (D.4) in the remaining cases.
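
The remaining cases of (D.4) can also be checked by machine. The sketch below
is only an illustration: Section 4.9 is not reproduced here, so the explicit
matrices `F` are an assumption (the usual generators of rotations about the
three coordinate axes, chosen so that $F_1 e_2 = e_3$ as in the text), and the
helper names are hypothetical.

```python
import numpy as np

# ASSUMPTION: explicit form of F_1, F_2, F_3 (not quoted from Section 4.9).
F = np.array([
    [[0, 0, 0], [0, 0, -1], [0, 1, 0]],
    [[0, 0, 1], [0, 0, 0], [-1, 0, 0]],
    [[0, -1, 0], [1, 0, 0], [0, 0, 0]],
], dtype=float)


def levi_civita():
    """eps[k, l, m] with indices 0, 1, 2 standing for 1, 2, 3."""
    eps = np.zeros((3, 3, 3))
    for k, l, m in [(0, 1, 2), (1, 2, 0), (2, 0, 1)]:
        eps[k, l, m] = 1.0    # cyclic permutations
        eps[l, k, m] = -1.0   # swapping two indices gives the non-cyclic ones
    return eps


EPS = levi_civita()
E = np.eye(3)  # E[l] is the standard basis vector e_{l+1}


def d4_holds():
    """Check F_k e_l = sum_m eps_klm e_m for all nine pairs (k, l)."""
    for k in range(3):
        for l in range(3):
            lhs = F[k] @ E[l]
            rhs = sum(EPS[k, l, m] * E[m] for m in range(3))
            if not np.allclose(lhs, rhs):
                return False
    return True


print(d4_holds())
```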


Proposition D.3. Suppose $\pi$ is a representation of so(3) acting on a space
$V$ and define $J_k$, $k = 1,2,3$, by $J_k = \pi(F_k)$. Now, suppose that
$A : \mathbb{R}^3 \to \mathrm{End}(V)$ is a linear map and define $A_k$,
$k = 1,2,3$, by $A_k = A(e_k)$. Then, $A$ is a vector operator if and only if

$$[J_k, A_l] = \sum_{m=1}^{3} \varepsilon_{klm} A_m \tag{D.5}$$

for all $k, l \in \{1, 2, 3\}$.


Proof. This is just (D.3) written out (in the reverse order) using the
expression (D.4) for $F_k e_l$ and setting $J_k = \pi(F_k)$ and
$A_k = A(e_k)$. □


Condition (D.5) differs from the one in the physics literature by a factor 
of i, reflecting a factor of i difference between the mathematics and physics 
literatures regarding the definition of the Lie algebra so(3). 


Proposition D.4. Suppose that $\pi$ is any finite-dimensional representation of
so(3) acting on a space $V$. Let $J$ be the unique linear map from
$\mathbb{R}^3$ into $\mathrm{End}(V)$ satisfying $J(e_k) = \pi(F_k)$,
$k = 1,2,3$. Then, $J$ is a vector operator.


That is to say, if we identify $\mathbb{R}^3$ with so(3) by identifying the
basis $e_1, e_2, e_3$ for $\mathbb{R}^3$ with the basis $F_1, F_2, F_3$ for
so(3), then $\pi$ itself is a vector operator; in physics notation,
$\mathbf{J} = (J_1, J_2, J_3)$ is a vector operator.

Proof. The commutation relations among the $F_k$'s (computed in Section 4.9)
may be expressed as

$$[F_k, F_l] = \sum_{m=1}^{3} \varepsilon_{klm} F_m. \tag{D.6}$$

Since $\pi$ is a representation, the $J_k$'s (defined as $\pi(F_k)$) have the
same commutation relations as the $F_k$'s:
$[J_k, J_l] = \sum_{m=1}^{3} \varepsilon_{klm} J_m$. This means that (D.5)
is satisfied if $A_k = J_k$, which shows that $J$ is a vector operator. □


Let us now assume that we have an inner product $\langle \cdot, \cdot \rangle$
on $V$ that is invariant under the action of so(3), in the sense that
$\langle \pi(X)u, v \rangle = -\langle u, \pi(X)v \rangle$ for all
$X \in \mathrm{so}(3)$ and $u, v \in V$. We call this a “unitary”
representation of so(3). (After all, so(3) is mapping into u($V$), the skew
self-adjoint operators on $V$, which is the Lie algebra of the group U($V$) of
unitary operators on $V$.)


Theorem D.5 (Wigner-Eckart). Suppose that $V$ is a finite-dimensional
unitary representation of so(3) and that
$A, B : \mathbb{R}^3 \to \mathrm{End}(V)$ are vector operators. Set
$A_k = A(e_k)$ and $B_k = B(e_k)$, $k = 1,2,3$. Suppose that $W_1$ and $W_2$
are irreducible so(3)-invariant subspaces and suppose that
$\langle w, A_k w' \rangle$ is nonzero for some $w \in W_1$, $w' \in W_2$, and
$k \in \{1,2,3\}$. Then, there exists a constant $c$ such that

$$\langle w, B_k w' \rangle = c \, \langle w, A_k w' \rangle$$

for all $w \in W_1$, $w' \in W_2$, and $k \in \{1,2,3\}$.


The constant c depends on the choice of W1, W2, A, and B, but is inde- 
pendent of w, w’, and k. 

In applications in physics, $V$ is usually infinite dimensional (but $W_1$ and
$W_2$ are still finite dimensional). The result still holds in that case,
subject to certain technical conditions.

Suppose that $W_1 \cong V_m$ and $W_2 \cong V_n$. There is then a standard way
of choosing a basis $u_m, u_{m-2}, \ldots, u_{2-m}, u_{-m}$ for $W_1$ and
$v_n, v_{n-2}, \ldots, v_{2-n}, v_{-n}$ for $W_2$. The Wigner-Eckart theorem
says that the matrix entries $\langle u_l, B_k v_{l'} \rangle$ are determined up
to a constant merely by the fact that $B$ is a vector operator. There exist
certain “universal” coefficients $a(m,n,k,l,l')$, given in terms of the
Clebsch-Gordan coefficients, that can be computed once and for all and that
capture these matrix entries up to a constant. So, the Wigner-Eckart theorem
can be expressed as saying that if $B$ is a vector operator acting on $V$, and
$W_1$ and $W_2$ are irreducible so(3)-invariant subspaces isomorphic to $V_m$
and $V_n$, respectively, then

$$\langle u_l, B_k v_{l'} \rangle = c \, a(m,n,k,l,l').$$


Note that the subspaces $W_1$ and $W_2$ are assumed invariant under the
action of so(3), but are not necessarily invariant under the $A_k$'s or
$B_k$'s.

Before turning to the proof of the Wigner-Eckart Theorem, we consider a
useful isomorphism. Suppose now that $\pi_1$ and $\pi_2$ are representations of
some Lie algebra g acting on finite-dimensional vector spaces $W_1$ and $W_2$,
respectively. Let $\mathrm{Hom}(W_2, W_1)$ denote the space of linear maps of
$W_2$ into $W_1$. Define a representation of g acting on
$\mathrm{Hom}(W_2, W_1)$ as follows. If $X \in$ g and
$C \in \mathrm{Hom}(W_2, W_1)$, then let the action of $X$ on $C$ be given by

$$\pi_1(X)C - C\pi_2(X). \tag{D.7}$$

It is easily verified that this formula does, indeed, make
$\mathrm{Hom}(W_2, W_1)$ into a representation of g.


Lemma D.6. Suppose that $\pi_1$ and $\pi_2$ are representations of a Lie
algebra g acting on finite-dimensional spaces $W_1$ and $W_2$, respectively.
Define an action of g on $\mathrm{Hom}(W_2, W_1)$ by (D.7). Then,

$$\mathrm{Hom}(W_2, W_1) \cong W_2^* \otimes W_1,$$

where $\cong$ denotes equivalence of representations of g.


Proof. Consider the unique linear map
$\Psi : W_2^* \otimes W_1 \to \mathrm{Hom}(W_2, W_1)$ such that for all
$\phi \in W_2^*$ and $u \in W_1$,

$$\Psi(\phi \otimes u)(v) = \phi(v)u.$$

(The universal property of the tensor product guarantees that there is, in fact,
a unique such map.) It is easily verified, using bases for $W_1$ and $W_2$, that
$\Psi$ is an invertible linear map. We now verify that $\Psi$ is an intertwining
map for the actions of g on $W_2^* \otimes W_1$ and on $\mathrm{Hom}(W_2, W_1)$.
Consider elements $\phi$ in $W_2^*$ and $u$ in $W_1$. Suppose we first let an
element $X$ of g act on $\phi \otimes u$. Recalling from Chapter 4 the way one
takes duals and tensor products of representations, we get

$$-(\phi \circ \pi_2(X)) \otimes u + \phi \otimes (\pi_1(X)u).$$

If we then apply the map $\Psi$ to this element and apply the resulting operator
to a vector $v \in W_2$, we get

$$-\phi(\pi_2(X)v)u + \phi(v)\pi_1(X)u. \tag{D.8}$$

Suppose, on the other hand, that we first apply $\Psi$ to $\phi \otimes u$, then
let $X$ act on the resulting operator by means of (D.7), and then apply the
result to the vector $v$. A simple calculation shows that the result is, again,
the quantity in (D.8). Thus, $\Psi$ intertwines the actions of g on elements of
the form $\phi \otimes u$. Since every element of $W_2^* \otimes W_1$ is a
linear combination of elements of this form, we conclude that $\Psi$ is an
intertwining map. □
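
The “simple calculation” in the proof can be written out as follows. This is a
sketch using only the action (D.7) and the defining property of the map from
$W_2^* \otimes W_1$ to $\mathrm{Hom}(W_2, W_1)$ (called $\Psi$ here); the
notation $X \cdot C$ for the action of $X$ on $C$ is introduced only for this
display.

```latex
\begin{align*}
\bigl(X \cdot \Psi(\phi \otimes u)\bigr)(v)
  &= \pi_1(X)\,\Psi(\phi \otimes u)(v) - \Psi(\phi \otimes u)\bigl(\pi_2(X)v\bigr) \\
  &= \pi_1(X)\bigl(\phi(v)\,u\bigr) - \phi\bigl(\pi_2(X)v\bigr)\,u \\
  &= \phi(v)\,\pi_1(X)u - \phi\bigl(\pi_2(X)v\bigr)\,u ,
\end{align*}
```

which agrees with (D.8) term by term.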


We now turn to the proof of the Wigner-Eckart Theorem. The ingredients
are Schur's Lemma, Lemma D.6, and Theorem D.1. The part of Theorem D.1
that we need is that each irreducible representation occurring in the
decomposition of $V_m \otimes V_n$ occurs only once.

Proof. Let $P_1$ denote the orthogonal projection operator from $V$ onto $W_1$.
Since the action of so(3) on $V$ is unitary and $W_1$ is an invariant subspace,
it is not hard to show that $P_1$ commutes with the action of so(3) (i.e., that
$P_1$ commutes with $J_k$, $k = 1,2,3$). Now, define a linear map
$\tilde{A}_k : W_2 \to W_1$ by the formula

$$\tilde{A}_k w' = P_1 A_k w', \quad w' \in W_2.$$

Since $A$ is a vector operator and $P_1$ commutes with each $J_k$, it is not
hard to show that the operators $\tilde{A}_k$ satisfy
$[J_k, \tilde{A}_l] = \sum_{m=1}^{3} \varepsilon_{klm} \tilde{A}_m$. This means
that $\tilde{A}$ is a “vector operator from $W_2$ to $W_1$.” Specifically, let
$\tilde{A}$ be the unique linear map from $\mathbb{R}^3$ into
$\mathrm{Hom}(W_2, W_1)$ such that $\tilde{A}(e_k) = \tilde{A}_k$,
$k = 1,2,3$. Then, $\tilde{A}$ is an intertwining map for the action of so(3).
We can extend $\tilde{A}$ by complex linearity to an intertwining map of
$\mathbb{C}^3$ into $\mathrm{Hom}(W_2, W_1)$, where $\mathbb{C}^3$ (the
complexification of the standard representation) is a three-dimensional
irreducible representation of so(3), hence isomorphic to $V_2$.

Now, $W_1$ and $W_2$ are irreducible representations of so(3), and so they are
isomorphic to $V_m$ and $V_n$, respectively, for some non-negative integers $m$
and $n$. Thus, by Lemma D.6 (and the fact that $V_n$ is isomorphic to its own
dual), $\mathrm{Hom}(W_2, W_1)$ is isomorphic to $V_m \otimes V_n$ as a
representation of so(3). Then, by Theorem D.1,
$\mathrm{Hom}(W_2, W_1) \cong V_m \otimes V_n$ decomposes as a direct sum of
irreducible representations of so(3) in such a way that the three-dimensional
representation $V_2 \cong \mathbb{C}^3$ occurs at most once. If $V_2$ does not
appear in the decomposition of $V_m \otimes V_n$, then by Schur's Lemma,
$\tilde{A}$ must be the zero map. If $V_2$ does occur in the decomposition, then
by Schur's Lemma, $\tilde{A}$ is determined up to a constant by the fact that it
is an intertwining map. Thus, if $\tilde{A} \neq 0$ and if $\tilde{B}$ is
defined by analogy to $\tilde{A}$, then $\tilde{B}$ must be a constant multiple
of $\tilde{A}$; that is, if $\tilde{A} \neq 0$, then there is a constant $c$
such that $\tilde{B}_k = c\tilde{A}_k$, $k = 1,2,3$.

We are now almost done. After all, the orthogonal projection operator
$P_1 : V \to W_1$ is self-adjoint, as is easily verified. Thus, for all
$w \in W_1$ and $w' \in W_2$,

$$\langle w, \tilde{A}_k w' \rangle = \langle w, P_1 A_k w' \rangle
  = \langle P_1 w, A_k w' \rangle = \langle w, A_k w' \rangle, \tag{D.9}$$

and similarly for $\tilde{B}_k$. Therefore, if $\langle w, A_k w' \rangle$ is
non-zero for some $w$ and $w'$, then $\tilde{A} \neq 0$ and, so,
$\tilde{B}_k = c\tilde{A}_k$. This, by (D.9), implies the Wigner-Eckart
Theorem. □
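
As a small numerical illustration of the uniqueness statement at the heart of
the proof, one can take $V = W_1 = W_2$ to be the standard representation on
$\mathbb{R}^3$ and solve condition (D.5) as a linear system for the unknown
triple $(B_1, B_2, B_3)$. The sketch below assumes an explicit form for
$F_1, F_2, F_3$ (Section 4.9 is not reproduced here), and all function names are
illustrative. The solution space should be one dimensional, so every vector
operator on this representation is a multiple of $\mathbf{J} = (F_1, F_2, F_3)$.

```python
import numpy as np

# ASSUMPTION: explicit form of F_1, F_2, F_3 (not quoted from Section 4.9).
F = np.array([
    [[0, 0, 0], [0, 0, -1], [0, 1, 0]],
    [[0, 0, 1], [0, 0, 0], [-1, 0, 0]],
    [[0, -1, 0], [1, 0, 0], [0, 0, 0]],
], dtype=float)

EPS = np.zeros((3, 3, 3))
for k, l, m in [(0, 1, 2), (1, 2, 0), (2, 0, 1)]:
    EPS[k, l, m] = 1.0
    EPS[l, k, m] = -1.0


def residual(B):
    """Stack [F_k, B_l] - sum_m eps_klm B_m over all (k, l).

    B has shape (3, 3, 3); B[l] is the operator B_{l+1}.
    """
    parts = []
    for k in range(3):
        for l in range(3):
            comm = F[k] @ B[l] - B[l] @ F[k]
            rhs = np.tensordot(EPS[k, l], B, axes=(0, 0))
            parts.append((comm - rhs).ravel())
    return np.concatenate(parts)


def vector_operator_dimension():
    """Dimension of the space of triples satisfying (D.5) with J_k = F_k."""
    columns = [residual(e.reshape(3, 3, 3)) for e in np.eye(27)]
    system = np.column_stack(columns)          # shape (81, 27)
    singular_values = np.linalg.svd(system, compute_uv=False)
    return int(np.sum(singular_values < 1e-10))


print(vector_operator_dimension())
```

The triple $(F_1, F_2, F_3)$ itself has zero residual, in line with
Proposition D.4.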


D.3 More on Vector Operators 


Let us now look more closely at the concept of a vector operator. Suppose at
first that associated to the representation $\pi$ of the Lie algebra so(3) there
is a representation $\Pi$ of the group SO(3). (Since SO(3) is not simply
connected, this will not always be the case.) Then, there is a representation
$\tilde{\Pi}$ of SO(3) associated to the representation $\tilde{\pi}$ of so(3)
acting on $\mathrm{End}(V)$, namely

$$\tilde{\Pi}(R)(A) = \Pi(R) A \Pi(R)^{-1}, \quad R \in \mathrm{SO}(3),\ A \in \mathrm{End}(V);$$

that is, $\tilde{\Pi}(R) = \mathrm{Ad}_{\Pi(R)}$; this is simply the group
counterpart to the expression for $\tilde{\pi}$,
$\tilde{\pi}(X) = \mathrm{ad}_{\pi(X)}$. If
$\psi : \mathbb{R}^3 \to \mathrm{End}(V)$ is a linear map, then $\psi$ is an
intertwining map for the so(3) actions on $\mathbb{R}^3$ and $\mathrm{End}(V)$
if and only if it is an intertwining map for the associated SO(3) actions. Thus,
$\psi$ is a vector operator if and only if

$$\psi(Rv) = \Pi(R)\psi(v)\Pi(R)^{-1} \tag{D.10}$$

for all $R \in \mathrm{SO}(3)$ and $v \in \mathbb{R}^3$. This condition is, in
physics terminology, the “finite” counterpart to the “infinitesimal” condition
(D.5).
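
Condition (D.10) can be tested numerically in the simplest case, where $V$ is
$\mathbb{R}^3$ with the standard action ($\Pi(R) = R$) and the vector operator
sends $e_k$ to $F_k$. The sketch below assumes an explicit form for the $F_k$
(Section 4.9 is not reproduced here); the names `psi`, `rot_z`, and `rot_x` are
illustrative only.

```python
import numpy as np

# ASSUMPTION: explicit form of F_1, F_2, F_3 (not quoted from Section 4.9).
F = np.array([
    [[0, 0, 0], [0, 0, -1], [0, 1, 0]],
    [[0, 0, 1], [0, 0, 0], [-1, 0, 0]],
    [[0, -1, 0], [1, 0, 0], [0, 0, 0]],
], dtype=float)


def psi(v):
    """Linear map R^3 -> End(R^3) with psi(e_k) = F_k."""
    return sum(v[k] * F[k] for k in range(3))


def rot_z(t):
    c, s = np.cos(t), np.sin(t)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])


def rot_x(t):
    c, s = np.cos(t), np.sin(t)
    return np.array([[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]])


R = rot_z(0.7) @ rot_x(-1.3)      # an element of SO(3)
v = np.array([0.3, -1.2, 2.0])

lhs = psi(R @ v)                   # psi(Rv)
rhs = R @ psi(v) @ R.T             # Pi(R) psi(v) Pi(R)^{-1}, with R^{-1} = R^T
print(np.allclose(lhs, rhs))
```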

In quantum mechanics, one considers a quantum Hilbert space $V$ equipped
with “angular momentum” operators $J_1$, $J_2$, and $J_3$ which satisfy the
so(3) commutation relations and thus give rise to a representation $\pi$ of
so(3). (In most cases, $V$ is infinite dimensional, but this should be regarded
as a technicality; subject to certain technical conditions, the Wigner-Eckart
theorem remains true if $V$ is infinite dimensional, provided that
$W \subset V$ is finite dimensional and irreducible.) Assuming there is an
associated representation $\Pi$ of SO(3) on $V$, an operator of the form
$\Pi(R)$ represents the action of a rotation $R$ of $\mathbb{R}^3$ on the space
of quantum states. One also has “position” operators $X_1$, $X_2$, and $X_3$
that describe (in the quantum realm) the $x_1$-, $x_2$-, and $x_3$-components of
the position of a particle. One also expects quantum mechanics to be a
rotationally invariant theory, which means that the $x_1$-component of the
position should not be fundamentally different from the $x_2$-component of the
position. What that means in this context is that if $R$ is a rotation that
takes $e_1$ to $e_2$, then we should have

$$X_2 = \Pi(R) X_1 \Pi(R)^{-1}.$$

This says that $X_2$ differs from $X_1$ simply by the action $\tilde{\Pi}(R)$ on
$\mathrm{End}(V)$. More generally, if $v$ is any unit vector in $\mathbb{R}^3$,
the operator corresponding to the $v$-component of the position will be
$\mathbf{X} \cdot v = X_1 v_1 + X_2 v_2 + X_3 v_3$ and we expect that

$$\mathbf{X} \cdot (Rv) = \Pi(R)(\mathbf{X} \cdot v)\Pi(R)^{-1};$$

that is, we expect that $\mathbf{X}$ should be a vector operator. Other common
vector operators in quantum mechanics are the angular momentum operators
themselves and the momentum operators.
Now, the discussion in the two previous paragraphs assumes that there is
a representation $\Pi$ of SO(3) associated to the representation $\pi$ of so(3).
In practice, this may not be the case, for example, when dealing with particles
with spin $\frac{1}{2}$. However, we can use the isomorphism
$\phi : \mathrm{su}(2) \to \mathrm{so}(3)$ (Section 4.9) to turn $\pi$ into a
representation $\sigma$ of su(2). Then, because SU(2) is simply connected, we
can form a SU(2) representation $\Sigma$. Assume that $\Sigma$ has the property
that

$$\Sigma(-I) = \pm I. \tag{D.11}$$

If $\Sigma$ is irreducible, then by Schur's Lemma, $\Sigma(-I)$ must be a
multiple of the identity, and since $(-I)^2 = I$, this multiple must be
$\pm 1$. So, (D.11) automatically holds when $\Sigma$ (or equivalently $\pi$) is
irreducible. However, in practice, we do not want to assume that $\pi$ is
irreducible and, so, (D.11) is a nontrivial restriction.

If (D.11) holds, then we can “define” a representation $\Pi$ of SO(3) by
setting

$$\Pi(R) = \Sigma(U), \tag{D.12}$$

where $U \in \mathrm{SU}(2)$ is chosen so that $\Phi(U) = R$. (Here
$\Phi : \mathrm{SU}(2) \to \mathrm{SO}(3)$ is the 2-to-1 group homomorphism
corresponding to the Lie algebra homomorphism $\phi$.) For any
$R \in \mathrm{SO}(3)$, such a $U$ exists and is unique up to a sign, and, so,
(D.12) serves to define $\Pi(R)$ up to a sign. We think of $\Pi(R)$ as actually
representing either of the two operators $\Sigma(U)$ and $\Sigma(-U)$. If
$\Sigma(-I) = -I$, then there is no consistent way to choose the signs in $\Pi$
and, so, $\Pi$ is a “double-valued representation” of SO(3). However, even if
$\Pi$ is double-valued, the associated representation $\tilde{\Pi}$ is
single-valued, since

$$\Pi(R) A \Pi(R)^{-1} = (-\Pi(R)) A (-\Pi(R))^{-1}.$$

So, the “global” form (D.10) of the definition of a vector operator makes
sense provided that (D.11) holds, even if $\Pi$ does not exist as a
single-valued representation of SO(3).
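
The sign ambiguity and its disappearance under conjugation can be seen
numerically. The sketch below is an illustration under stated assumptions: the
2-to-1 map SU(2) to SO(3) is realized by the common formula
$R_{jk} = \frac{1}{2}\,\mathrm{tr}(\sigma_j U \sigma_k U^*)$ in terms of the
Pauli matrices, which may differ from the map of Section 1.6 by a choice of
conventions, and the names `su2` and `cover` are hypothetical.

```python
import numpy as np

# Pauli matrices, used here only to write down an explicit covering map.
SIGMA = [
    np.array([[0, 1], [1, 0]], dtype=complex),
    np.array([[0, -1j], [1j, 0]], dtype=complex),
    np.array([[1, 0], [0, -1]], dtype=complex),
]


def su2(a, b):
    """Element of SU(2) built from complex numbers a, b (normalized here)."""
    norm = np.sqrt(abs(a) ** 2 + abs(b) ** 2)
    a, b = a / norm, b / norm
    return np.array([[a, -np.conj(b)], [b, np.conj(a)]])


def cover(U):
    """ASSUMED covering map: R_jk = (1/2) tr(sigma_j U sigma_k U^*)."""
    R = np.empty((3, 3))
    for j in range(3):
        for k in range(3):
            R[j, k] = 0.5 * np.trace(SIGMA[j] @ U @ SIGMA[k] @ U.conj().T).real
    return R


U = su2(0.6 + 0.2j, -0.3 + 0.7j)
R = cover(U)

# U and -U map to the same rotation, so U is determined by R only up to sign.
same_rotation = np.allclose(cover(U), cover(-U))

# Conjugation by U and by -U agree, so the action on operators is single-valued.
A = np.array([[1.0, 2.0], [3.0, 4.0]], dtype=complex)
same_conjugation = np.allclose(U @ A @ np.linalg.inv(U),
                               (-U) @ A @ np.linalg.inv(-U))
print(same_rotation, same_conjugation)
```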

A “representation” of a group that is defined only up to constant multiples 
of the identity is what is called a projective representation. More precisely, 
a projective representation of G is a continuous homomorphism of G into the 
quotient group GL(V)/C*I, where C*I is the group of all nonzero constant 
multiples of the identity, which is a normal subgroup of GL(V). In quantum 
mechanics, a constant multiple of a vector v € V represents the same physical 
state as v itself, so a projective representation of SO(3) still gives a well-defined 
action of SO(3) on the set of physical states. 


E 


Computing Fundamental Groups of Matrix Lie 
Groups 


E.1 The Fundamental Group 


Let $X$ be any path-connected Hausdorff topological space and let $x_0$ be a
fixed point in $X$ (the “basepoint”). We consider loops in $X$ based at $x_0$
(i.e., continuous maps $\gamma : [0,1] \to X$ with the property that
$\gamma(0) = \gamma(1) = x_0$). The choice of the basepoint makes no substantive
difference to the constructions that follow. From now on, “loop” will mean
“loop based at $x_0$.” Ultimately, we are interested in the case that $X$ is a
matrix Lie group.

If $\gamma_1$ and $\gamma_2$ are two loops, then we define the concatenation of
$\gamma_1$ and $\gamma_2$ to be the loop $\gamma_1 \cdot \gamma_2$ given by

$$(\gamma_1 \cdot \gamma_2)(t) = \begin{cases}
\gamma_1(2t), & 0 \le t \le \tfrac{1}{2} \\
\gamma_2(2t - 1), & \tfrac{1}{2} \le t \le 1;
\end{cases}$$

that is, $\gamma_1 \cdot \gamma_2$ traverses $\gamma_1$ as $t$ goes from 0 to
1/2 and then traverses $\gamma_2$ as $t$ goes from 1/2 to 1. (At $t = 1/2$, we
are at the basepoint.)
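
Concatenation is easy to express in code if loops are modeled as functions on
$[0,1]$. The sketch below is an illustration only: it uses loops on the unit
circle based at $(1, 0)$, and the names `circle_loop` and `concatenate` are not
from the text.

```python
import math


def circle_loop(n):
    """Loop on the unit circle based at (1, 0) that winds n times."""
    def loop(t):
        return (math.cos(2 * math.pi * n * t), math.sin(2 * math.pi * n * t))
    return loop


def concatenate(g1, g2):
    """Traverse g1 on [0, 1/2] and then g2 on [1/2, 1], each at double speed."""
    def loop(t):
        return g1(2 * t) if t <= 0.5 else g2(2 * t - 1)
    return loop


g = concatenate(circle_loop(1), circle_loop(2))
print(g(0.25) == circle_loop(1)(0.5))   # first half follows the first loop
print(g(0.75) == circle_loop(2)(0.5))   # second half follows the second loop
```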

Two loops $\gamma_1$ and $\gamma_2$ are said to be homotopic if one can be
“continuously deformed” into the other. More precisely, this means that there
exists a continuous map $A : [0,1] \times [0,1] \to X$ such that
$A(0,t) = \gamma_1(t)$ and $A(1,t) = \gamma_2(t)$ for all $t \in [0,1]$ and such
that $A(s,0) = A(s,1) = x_0$ for all $s \in [0,1]$. One should think of
$A(s,t)$ as a family of loops parameterized by $s$. One important example of
homotopic loops is the case when $\gamma_2$ is simply a reparameterization of
$\gamma_1$ with the same orientation. Suppose $f : [0,1] \to [0,1]$ is a
continuous, nondecreasing function with $f(0) = 0$ and $f(1) = 1$. Then, for any
loop $\gamma_1$, the loop $\gamma_2$ given by $\gamma_2(t) = \gamma_1(f(t))$
will be homotopic to $\gamma_1$.

A loop is said to be null homotopic if it is homotopic to the constant
loop (i.e., the loop $\gamma^0$ for which $\gamma^0(t) = x_0$ for all
$t \in [0,1]$). If all loops in $X$ are null homotopic, then $X$ is said to be
simply connected. (This is not quite identical to but is equivalent to the
definition given in Section 1.5.) The notion of homotopy is an equivalence
relation. This means (1) that every loop is homotopic to itself, (2) that
$\gamma_1$ is homotopic to $\gamma_2$ if and only if $\gamma_2$ is homotopic to
$\gamma_1$, and (3) that if $\gamma_1$ is homotopic to $\gamma_2$ and $\gamma_2$
is homotopic to $\gamma_3$, then $\gamma_1$ is homotopic to $\gamma_3$.

The homotopy class of a loop $\gamma$ is the set of all loops that are
homotopic to $\gamma$. Each loop belongs to one and only one equivalence class.
The concatenation operation “respects homotopy.” This means that if $\gamma_1$
is homotopic to $\gamma_2$ and $\delta_1$ is homotopic to $\delta_2$, then
$\gamma_1 \cdot \delta_1$ is homotopic to $\gamma_2 \cdot \delta_2$. As a
result, it makes sense to define the concatenation operation on equivalence
classes.

The fundamental group of $X$, denoted $\pi_1(X)$, is the set of homotopy
classes equipped with the operation of concatenation. We need to check that
this operation indeed makes the set of homotopy classes into a group. So,
we must check associativity, the existence of an identity, and the existence of
inverses. For associativity, we note that although
$(\gamma_1 \cdot \gamma_2) \cdot \gamma_3$ is not the same as
$\gamma_1 \cdot (\gamma_2 \cdot \gamma_3)$, the second of these two loops is a
reparameterization of the first. Thus, the homotopy class of
$(\gamma_1 \cdot \gamma_2) \cdot \gamma_3$ is the same as the homotopy class of
$\gamma_1 \cdot (\gamma_2 \cdot \gamma_3)$ and, so, concatenation is associative
at the level of homotopy classes. The identity in the fundamental group is the
constant loop $\gamma^0$. This is not an identity at the level of loops but is
at the level of homotopy classes; that is, $\gamma \cdot \gamma^0$ and
$\gamma^0 \cdot \gamma$ are not equal to $\gamma$, but they are both homotopic
to $\gamma$, since both are reparameterizations of $\gamma$. Finally, for
inverses, the inverse to a homotopy class $[\gamma]$ is the homotopy class
$[\gamma']$ where $\gamma'(t) = \gamma(1 - t)$. It can be shown that
$\gamma \cdot \gamma'$ and $\gamma' \cdot \gamma$ are both null homotopic. A
topological space $X$ is simply connected precisely if its fundamental group is
the trivial group (consisting of just the identity element).

Some standard examples of fundamental groups are as follows: $\mathbb{R}^n$ is
simply connected for all $n$, $S^n$ is simply connected for $n \ge 2$, and the
fundamental group of $S^1$ is isomorphic to $\mathbb{Z}$. For more information
on fundamental groups and related topics, see Munkres (1975) or Hatcher (2002).
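
For the circle, the isomorphism of the fundamental group with the integers is
realized by the winding number, and the group operations on loops can be
watched numerically. The following sketch is illustrative only: loops are
modeled as functions from $[0,1]$ to the unit circle in the complex plane, the
winding number is approximated by summing small changes of angle, and all
function names are hypothetical.

```python
import cmath
import math


def circle_loop(n):
    """Loop t -> exp(2 pi i n t) on the unit circle, based at 1."""
    return lambda t: cmath.exp(2j * math.pi * n * t)


def concatenate(g1, g2):
    """First g1, then g2, each at double speed."""
    return lambda t: g1(2 * t) if t <= 0.5 else g2(2 * t - 1)


def reverse(g):
    """The loop t -> g(1 - t), representing the inverse homotopy class."""
    return lambda t: g(1 - t)


def winding_number(g, steps=2000):
    """Total change of angle along g, divided by 2 pi and rounded."""
    total = 0.0
    previous = g(0.0)
    for i in range(1, steps + 1):
        current = g(i / steps)
        total += cmath.phase(current / previous)  # small angle increment
        previous = current
    return round(total / (2 * math.pi))


print(winding_number(concatenate(circle_loop(2), circle_loop(-3))))  # -1
```

Concatenation adds winding numbers and reversal negates them, matching addition
and negation in the integers.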


E.2 The Universal Cover 


If $X$ is a sufficiently nice topological space (the precise conditions need not
concern us here except to say that a connected matrix Lie group is nice
enough), then one can construct something called a universal cover of $X$.
This is a simply-connected topological space $\tilde{X}$ together with a
continuous map $P : \tilde{X} \to X$ with the following property: Each
$x \in X$ has a neighborhood $U$ such that $P^{-1}(U)$ is a disjoint union of
open sets $V_\alpha$ such that $P$ maps each $V_\alpha$ homeomorphically onto
$U$. The map $P$ is called the projection map. If we think of $P$ as projecting
$\tilde{X}$ “down” onto $X$, then $P^{-1}(U)$ is the set of points in
$\tilde{X}$ lying above $U$. This set consists of a family of disjoint open sets
$V_\alpha$, each of which is identical to (i.e., homeomorphic to) $U$.

If $X$ is a Lie group $G$, then the universal cover $\tilde{G}$ can be given the
structure of a Lie group in such a way that the projection map
$P : \tilde{G} \to G$ is a smooth homomorphism. In that case $\tilde{G}$ is
called a universal covering group of $G$ and the associated Lie algebra map
$p : \tilde{\mathfrak{g}} \to \mathfrak{g}$ is a Lie algebra isomorphism. We
have the following uniqueness result for universal covering groups. Suppose
that $H_1$ and $H_2$ are simply-connected Lie groups and $P_1 : H_1 \to G$ and
$P_2 : H_2 \to G$ are Lie group homomorphisms such that the associated Lie
algebra maps $p_1 : \mathfrak{h}_1 \to \mathfrak{g}$ and
$p_2 : \mathfrak{h}_2 \to \mathfrak{g}$ are isomorphisms. Then, there exists a
Lie group isomorphism $\Phi : H_1 \to H_2$ such that $P_1(h) = P_2(\Phi(h))$ for
all $h$ in $H_1$. With this uniqueness result in mind (compare Corollary 3.8),
we speak of the universal covering group of $G$.

If $(\tilde{G}, P)$ is the universal covering group of $G$, then the fundamental
group of $G$ has the property that

$$\pi_1(G) \cong \ker P.$$

So, if one can find the universal covering group and associated projection
map explicitly, then this allows one to compute the fundamental group of $G$.
(Usually, there is no direct way of producing the universal cover and so one
has to compute $\pi_1(G)$ by other means.) The kernel of any homomorphism
is a closed normal subgroup. Since the Lie algebra map associated to $P$ is
an isomorphism, the kernel of $P$ must be discrete and, therefore, a subgroup
of the center of $\tilde{G}$ (Exercise 11 from Chapter 1). This shows that
$\pi_1(G)$ is commutative for any Lie group $G$. (This can also be shown
directly.) For general topological spaces, the fundamental group may be
noncommutative.

Even if $G$ is a matrix Lie group, $\tilde{G}$ may not be a matrix Lie group, as
the example $G = \mathrm{SL}(n;\mathbb{R})$ in Appendix C demonstrates. Even if
$\tilde{G}$ does happen to be a matrix Lie group, there is no canonical way to
represent $\tilde{G}$ as a matrix Lie group. This shows the advantage of working
with general Lie groups instead of just matrix Lie groups. On the other hand, if
$G$ is a (connected) matrix Lie group and one can somehow find a
simply-connected matrix Lie group $H$ and a homomorphism $P : H \to G$ such that
the associated Lie algebra map is an isomorphism, then $(H, P)$ is the universal
covering group of $G$ and one never needs to concern oneself with nonmatrix Lie
groups. For example, if $G$ is SO(3), then we can take $H$ to be SU(2) and $P$
to be the map given in Section 1.6, whose kernel is $\{I, -I\}$. This shows that
$\pi_1(\mathrm{SO}(3))$ is isomorphic to $\mathbb{Z}/2$.


E.3 Fundamental Groups of Compact Lie Groups I 


For any nice topological space, one can define higher homotopy groups
$\pi_k(X)$, $k = 1,2,3,\ldots$. The precise definition need not concern us here.
The relevant points are that $\pi_1(X)$ is the fundamental group as defined in
the previous sections and that $\pi_k(X)$ is trivial (i.e., contains only the
identity) if and only if every continuous map of the $k$-sphere $S^k$ into $X$
can be shrunk continuously to a point. We will make use of the following
standard topological result:


Proposition E.1. For a $d$-sphere $S^d$, $\pi_k(S^d)$ is trivial if $k < d$.

This result is plausible because for $k < d$, the image of a “typical”
continuous map of $S^k$ into $S^d$ will not be all of $S^d$. However, if the
image of the map omits even one point in $S^d$, then we can remove that point
and what is left of the sphere can be contracted continuously to a point.


Definition E.2. Suppose that $B$ and $F$ are Hausdorff topological spaces. A
fiber bundle with base $B$ and fiber $F$ is a Hausdorff topological space $X$
together with a continuous map $p : X \to B$, called the projection map, having
the following properties. First, for each $b$ in $B$, the preimage $p^{-1}(b)$
of $b$ in $X$ is homeomorphic to $F$. Second, given any $b$ in $B$, there is a
neighborhood $U$ of $b$ such that $p^{-1}(U)$ is homeomorphic to $U \times F$ in
such a way that the projection map is simply projection onto the first factor.


The second condition can be stated more pedantically as follows. For each
$b \in B$, there should exist a neighborhood $U$ of $b$ and a homeomorphism
$\Phi$ of $p^{-1}(U)$ with $U \times F$ having the property that
$p(x) = p_1(\Phi(x))$, where $p_1 : U \times F \to U$ is the map
$p_1(u, f) = u$. The sets of the form $p^{-1}(b)$ are called the fibers of the
fiber bundle.

The simplest sort of fiber bundle is the product space X = B x F, with 
the projection map being simply the projection onto the first factor. Such a 
fiber bundle is called trivial. The second condition in the definition of a fiber 
bundle is called local triviality and it says that any fiber bundle must look 
locally like a trivial bundle. In general, X need not be globally homeomorphic 
to B x F. 

If $X$ were a trivial fiber bundle, then the fundamental group of $X$ would
be simply the product of the fundamental group of the base $B$ and the
fundamental group of the fiber $F$. In particular, if $X$ were a trivial fiber
bundle and $\pi_1(B)$ were trivial, then $\pi_1(X)$ would be isomorphic to
$\pi_1(F)$. The following result says that if $\pi_1(B)$ and $\pi_2(B)$ are
trivial, then the same conclusion holds even if $X$ is nontrivial.


Theorem E.3. Suppose that $X$ is a fiber bundle with base $B$ and fiber $F$. If
$\pi_1(B)$ and $\pi_2(B)$ are trivial (i.e., contain only the identity), then
$\pi_1(X)$ is isomorphic to $\pi_1(F)$.


This result is a consequence of the long exact sequence of homotopy groups 
for fiber bundles. 
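
For orientation, the relevant portion of that long exact sequence can be
displayed as follows. This is a sketch with basepoints suppressed; the maps are
the standard ones induced by the inclusion of a fiber and by the projection,
together with the connecting homomorphism.

```latex
\cdots \longrightarrow \pi_2(B) \longrightarrow \pi_1(F)
       \longrightarrow \pi_1(X) \longrightarrow \pi_1(B)
       \longrightarrow \cdots
```

If $\pi_2(B)$ and $\pi_1(B)$ are both trivial, exactness at $\pi_1(F)$ makes the
middle map injective and exactness at $\pi_1(X)$ makes it surjective, so it is
an isomorphism of $\pi_1(F)$ with $\pi_1(X)$.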


Theorem E.4. Suppose $G$ acts smoothly and transitively on a smooth manifold
$M$ and let $m_0$ be an arbitrary point in $M$. Define $p : G \to M$ by
$p(g) = g \cdot m_0$. Let $H$ be the stabilizer of the point $m_0$, that is,

$$H = \{ g \in G \mid g \cdot m_0 = m_0 \}.$$

Then, $G$ is a fiber bundle with base $M$ and fiber $H$.

This is a standard result in Lie group theory. See, for example, Warner
(1983).


Example 1: SO($n$) acting on the unit sphere in $\mathbb{R}^n$. We think of the
$(n-1)$-dimensional sphere $S^{n-1}$ as the set of vectors of length 1 inside
$\mathbb{R}^n$. If $\|x\|^2 = \langle x, x \rangle = 1$, then
$\|Rx\|^2 = \langle Rx, Rx \rangle = 1$ for all $R$ in SO($n$). Thus, SO($n$)
acts on $S^{n-1}$ and the action is easily seen to be smooth. It is not hard to
show that SO($n$) acts transitively on $S^{n-1}$, for all $n \ge 2$. Let us take
as our basepoint the vector $e_1 = (1, 0, \ldots, 0)$ in $S^{n-1}$. If
$R \in \mathrm{SO}(n)$ and $Re_1 = e_1$, then $R$ must map the orthogonal
complement of $e_1$, namely the span of $e_2, \ldots, e_n$, into itself. So, $R$
must be of the form

$$R = \begin{pmatrix} 1 & 0 \\ 0 & R' \end{pmatrix},$$

with $R' \in \mathrm{SO}(n-1)$. Thus, the stabilizer of $e_1$ is (isomorphic to)
SO($n-1$).

Thus, by Theorem E.4, SO($n$) is a fiber bundle with base $S^{n-1}$, fiber
SO($n-1$), and projection map given by $R \mapsto Re_1$. Suppose that $n$ is at
least 4, so that $n - 1$ is at least 3. Then, by Proposition E.1,
$\pi_1(S^{n-1})$ and $\pi_2(S^{n-1})$ are trivial and, so, Theorem E.3 tells us
that $\pi_1(\mathrm{SO}(n))$ is isomorphic to $\pi_1(\mathrm{SO}(n-1))$, for
$n \ge 4$. Thus, $\pi_1(\mathrm{SO}(n))$ is isomorphic to
$\pi_1(\mathrm{SO}(3))$ for all $n \ge 4$.
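
Transitivity of the action on the sphere can be made concrete by writing down,
for a given unit vector $x$, a rotation that carries $e_1$ to $x$. The sketch
below is one possible construction and is not taken from the text: it composes a
Householder reflection exchanging $e_1$ and $x$ with a second reflection that
fixes $e_1$, so that the product has determinant 1. The function name is
hypothetical, and the input is assumed to be a unit vector in dimension at
least 2.

```python
import numpy as np


def rotation_taking_e1_to(x):
    """Return R in SO(n) with R e_1 = x, for a unit vector x and n >= 2."""
    x = np.asarray(x, dtype=float)
    n = x.size
    e1 = np.zeros(n)
    e1[0] = 1.0
    w = e1 - x
    if np.allclose(w, 0.0):
        return np.eye(n)                     # x is already e_1
    # Householder reflection across the hyperplane orthogonal to w:
    # it exchanges e_1 and x and has determinant -1.
    H = np.eye(n) - 2.0 * np.outer(w, w) / (w @ w)
    # Reflection in the last coordinate: fixes e_1 (since n >= 2), det -1.
    D = np.eye(n)
    D[-1, -1] = -1.0
    return H @ D                             # determinant (+1), sends e_1 to x


x = np.array([0.5, 0.5, 0.5, 0.5])           # a unit vector in R^4
R = rotation_taking_e1_to(x)
print(np.allclose(R[:, 0], x))               # first column of R is R e_1
```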

This method does not tell us what $\pi_1(\mathrm{SO}(3))$ is, but this can be
computed “by hand” by showing that SO(3) is homeomorphic to $\mathbb{RP}^3$
(Section 1.5), by showing that $\mathrm{SO}(3) \cong \mathrm{SU}(2)/\{I, -I\}$,
or by the method of the next section. In any case the conclusion is that
$\pi_1(\mathrm{SO}(3)) \cong \mathbb{Z}/2$. We thus conclude that
$\pi_1(\mathrm{SO}(n)) \cong \mathbb{Z}/2$ for all $n \ge 3$.

Meanwhile, if $n = 3$, then $\pi_2(S^{n-1})$ is not trivial and, so, Theorem E.3
does not tell us that $\pi_1(\mathrm{SO}(2))$ is the same as
$\pi_1(\mathrm{SO}(3))$. Indeed, SO(2) is diffeomorphic to $S^1$ and, so,
$\pi_1(\mathrm{SO}(2)) \cong \mathbb{Z}$.


Example 2: SU($n$) acting on the unit sphere in $\mathbb{C}^n$. The group
SU($n$) acts on the unit sphere $S^{2n-1}$ in
$\mathbb{C}^n \cong \mathbb{R}^{2n}$. The action is transitive for all
$n \ge 2$ and the stabilizer of a point is SU($n-1$). So, SU($n$) is a fiber
bundle with base $S^{2n-1}$ and fiber SU($n-1$). As long as $n$ is at least 2,
the sphere in question will have dimension at least 3 and, so,
$\pi_1(S^{2n-1})$ and $\pi_2(S^{2n-1})$ will be trivial. We conclude, therefore,
that $\pi_1(\mathrm{SU}(n)) \cong \pi_1(\mathrm{SU}(n-1))$ for all $n \ge 2$. In
particular, $\pi_1(\mathrm{SU}(2))$ is isomorphic to $\pi_1(\mathrm{SU}(1))$,
and since SU(1) is just a point, its fundamental group is trivial. So, we
conclude (as we already knew) that SU(2) is simply connected. Applying this
argument then for $n = 3, 4, \ldots$ shows that SU($n$) is simply connected for
all $n$.


Example 3: U($n$) acting on the unit sphere in $\mathbb{C}^n$. The group U($n$)
also acts on $S^{2n-1}$. The same argument as for SU($n$) shows that all the
U($n$)'s have the same fundamental group. However, U(1) is diffeomorphic to
$S^1$ and, so, $\pi_1(\mathrm{U}(1)) \cong \mathbb{Z}$. Thus,
$\pi_1(\mathrm{U}(n)) \cong \mathbb{Z}$ for all $n$.

Example 4: Sp($n$) acting on the unit sphere in $\mathbb{C}^{2n}$. The group
$\mathrm{Sp}(n) := \mathrm{Sp}(n;\mathbb{C}) \cap \mathrm{U}(2n)$ is contained
in U($2n$) and therefore preserves the unit sphere $S^{4n-1}$ in
$\mathbb{C}^{2n}$. In fact, it acts transitively on this sphere, and the
stabilizer of a point is isomorphic to Sp($n-1$). Already for $n = 2$, this
sphere has dimension 7, so we conclude that
$\pi_1(\mathrm{Sp}(n)) \cong \pi_1(\mathrm{Sp}(n-1))$ for all $n \ge 2$.
However, $\mathrm{Sp}(1;\mathbb{C}) = \mathrm{SL}(2;\mathbb{C})$ and so
$\mathrm{Sp}(1) = \mathrm{SL}(2;\mathbb{C}) \cap \mathrm{U}(2) = \mathrm{SU}(2)$,
which is simply connected. Thus, Sp($n$) is simply connected for all $n \ge 1$.


These examples complete the table of fundamental groups of compact 
groups given in Chapter 1. 


E.4 Fundamental Groups of Compact Lie Groups II 


We consider now a more algebraic approach to computing $\pi_1(K)$, where $K$ is
a compact connected Lie group. We also discuss in this section a method of
computing the center of $K$, which is not really relevant to the computation of
$\pi_1$ but which uses the same constructions.

Let $K$ be a connected compact Lie group with Lie algebra $\mathfrak{k}$.
Although we will use the machinery of roots and weights, it is not necessary
here to assume that $\mathfrak{k}$ is semisimple. (Recall that the Lie algebra
of a compact group is automatically reductive (i.e., the direct sum of a
semisimple Lie algebra and a commutative Lie algebra).) We choose once and for
all an inner product on $\mathfrak{k}$ that is invariant under the adjoint
action of $K$ and a maximal commutative subalgebra $\mathfrak{t}$ of
$\mathfrak{k}$. Then, let $T$ be the connected Lie subgroup of $K$ whose Lie
algebra is $\mathfrak{t}$. Then $T$ is a closed subgroup of $K$ and is called a
maximal torus in $K$. It is customary in the setting of compact groups to work
with the real roots (as in Section 7.4). The real roots are the nonzero elements
$\alpha$ of $\mathfrak{t}$ for which there exists a nonzero $X$ in
$\mathfrak{k}_{\mathbb{C}}$ such that

$$[H, X] = i \langle \alpha, H \rangle X$$

for all $H$ in $\mathfrak{t}$. Although $\mathfrak{k}$ (or equivalently
$\mathfrak{k}_{\mathbb{C}}$) is not assumed semisimple, all of the properties of
roots from the semisimple case hold except that the roots do not necessarily
span $\mathfrak{t}$.

For each root $\alpha$, we have the root spaces $\mathfrak{g}_\alpha$ and
$\mathfrak{g}_{-\alpha}$ inside the complexified Lie algebra
$\mathfrak{k}_{\mathbb{C}}$. These are both one dimensional and they, together
with their commutator, span a three-dimensional subalgebra
$\mathfrak{s}^\alpha$ of $\mathfrak{k}_{\mathbb{C}}$ that is isomorphic to
sl(2;C). An examination of the proof of this result (Theorem 6.20) will show
that the intersection of $\mathfrak{s}^\alpha$ with the real Lie algebra
$\mathfrak{k}$ is a three-dimensional subalgebra
$\mathfrak{s}^\alpha_{\mathbb{R}}$ of $\mathfrak{k}$ that is isomorphic to
su(2). Specifically, if $X_\alpha$, $Y_\alpha$, and $H_\alpha$ are as in
Theorem 6.20, then we set
Ja = oe.

[A 

1 
Ka = 9 (Xa = Ya), 
La = 5 (Ka + Ya). 


Then J_α ∈ 𝔱, and K_α and L_α are in 𝔨. With J_α normalized so that (E.1)
below holds, it is the elements (1/2)J_α, K_α, and L_α that satisfy the usual
su(2) commutation relations: [(1/2)J_α, K_α] = L_α, [K_α, L_α] = (1/2)J_α, and
[L_α, (1/2)J_α] = K_α. We will
call the element J_α the real co-root corresponding to the root α. Then, the
real roots and the real co-roots are related in the same way as the roots and
co-roots in Chapter 6, namely


\[
J_\alpha = 2\frac{\alpha}{\langle \alpha, \alpha \rangle}. \tag{E.1}
\]

Definition E.5. Let Γ denote the set of all H in 𝔱 for which exp(2πH) = I.
We call Γ the kernel of the exponential mapping (for 𝔱). Let J denote the
set of all linear combinations of the real co-roots J_α with integer coefficients.


It is a slight abuse of terminology to call Γ the kernel of the exponential
mapping, since it is actually the kernel of the map H ↦ exp(2πH), that is, the
exponential mapping composed with multiplication by 2π in 𝔱.


Proposition E.6. If α is a real root and J_α the associated real co-root, then
exp(2πJ_α) = I. Thus, J ⊂ Γ.


Proof. The proposition holds in the SU(2) case, since in that case, J_α is the
matrix

\[
\begin{pmatrix} i & 0 \\ 0 & -i \end{pmatrix}. \tag{E.2}
\]

For the general case, we define a homomorphism φ from su(2) into 𝔨 by
mapping the usual basis for su(2) to the elements J_α, K_α, and L_α. Since SU(2) is
simply connected, Theorem 3.7 tells us that there is a Lie group
homomorphism Φ : SU(2) → K with the property that Φ(exp X) = exp(φ(X)) for all
X ∈ su(2). Applying this with X equal to the matrix in (E.2), we see that

\[
\exp(2\pi J_\alpha) = \exp(\phi(2\pi X)) = \Phi(\exp(2\pi X)) = \Phi(I) = I.
\]
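
As a numerical sanity check of Proposition E.6 in the SU(2) case, the following sketch exponentiates 2π times the matrix in (E.2). It is not part of the original development: the helper names (mat_mul, mat_exp, scale, is_close) are our own, and the truncated Taylor series used for the matrix exponential is an assumption that is adequate only for small matrices of modest norm.

```python
import math

def mat_mul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def identity(n):
    return [[complex(i == j) for j in range(n)] for i in range(n)]

def mat_exp(a, terms=80):
    """Approximate exp(a) by the first `terms` terms of its Taylor series."""
    n = len(a)
    result = identity(n)
    term = identity(n)  # holds a^k / k! after the k-th pass
    for k in range(1, terms):
        term = [[x / k for x in row] for row in mat_mul(term, a)]
        result = [[result[i][j] + term[i][j] for j in range(n)]
                  for i in range(n)]
    return result

def scale(c, a):
    return [[c * x for x in row] for row in a]

def is_close(a, b, tol=1e-9):
    n = len(a)
    return all(abs(a[i][j] - b[i][j]) < tol
               for i in range(n) for j in range(n))

J = [[1j, 0], [0, -1j]]      # the matrix in (E.2)
J_conj = [[0, 1], [-1, 0]]   # eigenvalues +i and -i, hence conjugate to J in SU(2)

print(is_close(mat_exp(scale(2 * math.pi, J)), identity(2)))
print(is_close(mat_exp(scale(math.pi, J)), scale(-1, identity(2))))
print(is_close(mat_exp(scale(2 * math.pi, J_conj)), identity(2)))
```

The second check illustrates why the factor of 2π matters: exponentiating only π times the co-root gives −I rather than I.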


We are now ready to state the main theorem. 


Theorem E.7. Let K be a compact connected Lie group with Lie algebra 𝔨
and let 𝔱 be a maximal commutative subalgebra of 𝔨. Let Γ be the kernel of
the exponential mapping for 𝔱 and let J be the lattice generated by the real
co-roots. Then,

\[
\pi_1(K) \cong \Gamma / J.
\]

Let us see how this works out in the case of SU(2) and SO(3). For the case
of SU(2), we may consider the maximal commutative subalgebra 𝔱 consisting
of the diagonal matrices inside su(2). Then, there are two roots (negatives of
each other), denoted α and −α, with corresponding root vectors X and Y.
The real co-root corresponding to α is the element

\[
J_\alpha = \begin{pmatrix} i & 0 \\ 0 & -i \end{pmatrix},
\]

and the real co-root corresponding to −α is −J_α. We now observe that

\[
e^{2\pi t J_\alpha} = \begin{pmatrix} e^{2\pi i t} & 0 \\ 0 & e^{-2\pi i t} \end{pmatrix}.
\]

This means that the kernel of the exponential mapping is precisely the set of
integer multiples of J_α. So, in this case, Γ and J coincide and we conclude
(again) that SU(2) is simply connected.

Meanwhile in SO(3), we use the isomorphism φ of su(2) and so(3) described
in Section 4.9. We take our maximal commutative subalgebra as the image of
𝔱 under φ, in which case the real co-roots will be ±φ(J_α), where

\[
\phi(J_\alpha) = \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & -2 \\ 0 & 2 & 0 \end{pmatrix}.
\]

Now, we compute that

\[
e^{2\pi t \phi(J_\alpha)} = \begin{pmatrix} 1 & 0 & 0 \\ 0 & \cos(4\pi t) & -\sin(4\pi t) \\ 0 & \sin(4\pi t) & \cos(4\pi t) \end{pmatrix}.
\]

We see that Γ consists of all integer or half-integer multiples of φ(J_α), and, so,
Γ/J has two elements and is isomorphic to Z/2. Thus, π_1(SO(3)) ≅ Z/2.
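
The contrast between the two groups can be checked numerically. The sketch below is our own illustration: the names PHI_J, loop_endpoint, and is_identity are hypothetical, and the matrix exponential is again a truncated Taylor series. It evaluates the exponential of 2πt times the co-root matrix at a few values of t, both for the 3 × 3 co-root matrix of SO(3) displayed above and for the SU(2) co-root.

```python
import math

def mat_mul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_exp(a, terms=80):
    """Truncated Taylor series for exp(a); adequate for these small matrices."""
    n = len(a)
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms):
        term = [[x / k for x in row] for row in mat_mul(term, a)]
        result = [[result[i][j] + term[i][j] for j in range(n)]
                  for i in range(n)]
    return result

def is_identity(a, tol=1e-8):
    n = len(a)
    return all(abs(a[i][j] - (1 if i == j else 0)) < tol
               for i in range(n) for j in range(n))

PHI_J = [[0, 0, 0], [0, 0, -2], [0, 2, 0]]  # real co-root of SO(3)
J = [[1j, 0], [0, -1j]]                     # real co-root of SU(2)

def loop_endpoint(generator, t):
    """Return exp(2*pi*t*generator)."""
    return mat_exp([[2 * math.pi * t * x for x in row] for row in generator])

for t in (0.25, 0.5, 1.0):
    print(t, is_identity(loop_endpoint(PHI_J, t)),
          is_identity(loop_endpoint(J, t)))
```

At t = 1/2 the SO(3) exponential is already the identity, whereas the SU(2) exponential is −I. This half-integer multiple is exactly the element of Γ that is not in J.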

We may also consider the case of U(1), which is compact but not
semisimple. In this case, there are no roots and, therefore, no co-roots. This means
that J = {0}. However, Γ is still isomorphic to Z, and, so, π_1(U(1)) ≅ Z/{0} = Z.
In general, if K is compact but not semisimple, then π_1(K) will be infinite.

We have already encountered in Section 7.4 the distinction between
algebraically integral elements and analytically integral elements. Theorem E.7
allows us to understand the relationship between the two sets of integral
elements. So, we consider the set of algebraically integral real elements.
These are the elements μ of 𝔱 with the property that

\[
2\frac{\langle \mu, \alpha \rangle}{\langle \alpha, \alpha \rangle} \in \mathbb{Z} \tag{E.3}
\]

for all real roots α. We also consider the set of analytically integral real
elements. These are the elements μ of 𝔱 with the property that

\[
\langle \mu, H \rangle \in \mathbb{Z}
\]

for all H in 𝔱 with the property that exp(2πH) = I.

From now on, we omit the word “real” and assume that all roots and
integral elements are of the real variety. We have the following important
consequence of Theorem E.7.


Corollary E.8. The set of algebraically integral elements coincides with the 
set of analytically integral elements if and only if K is simply connected. 


In general, the analytically integral elements are those that can occur as
weights of representations of the group K, whereas the algebraically integral
elements are those that occur as weights of representations of the Lie algebra
𝔨. In the simply-connected case, the representations of K are in one-to-one
correspondence with the representations of 𝔨, and, correspondingly, the
analytically integral elements are (in that case) the same as the algebraically
integral elements.


Proof. The analytically integral elements are by definition those elements of
𝔱 whose inner product with each element of Γ is an integer; that is, the lattice
of analytically integral elements is the “dual lattice” to the kernel of the
exponential mapping. Meanwhile, comparing the defining condition (E.3) for
the algebraically integral elements with the formula (E.1) for the co-roots,
we see that the algebraically integral elements are those whose inner product
with each co-root is an integer, and, hence, whose inner product with each
element of J is an integer. (This means that the lattice of algebraically integral
elements is the “dual lattice” to the lattice generated by the co-roots.) If K
is simply connected, then by Theorem E.7, J and Γ coincide and, so, the
algebraically integral and analytically integral elements coincide.

If K is not simply connected then, by Theorem E.7, J must be a proper
subset of Γ. Now, if α_1, ..., α_r are the positive simple roots (where r is the
dimension of the part of 𝔱 spanned by the α's), then J_{α_1}, ..., J_{α_r} are linearly
independent (over R), and the elements of J are precisely the linear
combinations of these elements with integer coefficients. Then, consider an element
H of Γ that is not in J. The (unique) expansion of H in terms of J_{α_1}, ..., J_{α_r}
must have some noninteger coefficients. Then, at least one of the fundamental
weights (which are algebraically integral elements) will take a noninteger
value on H and will, therefore, be an algebraically integral element that is not
analytically integral. □
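
To make the corollary concrete, here is a short worked computation for SU(2) and SO(3), offered as an illustration rather than as part of the proof. We assume the rank-one setup of the examples above, identify the two Lie algebras by means of φ, and write every element of 𝔱 as μ = cα with c real. The only input is (E.1), which gives ⟨α, J_α⟩ = 2.

```latex
\[
\begin{aligned}
&\text{Algebraic integrality (E.3):}
  && 2\frac{\langle \mu, \alpha \rangle}{\langle \alpha, \alpha \rangle} = 2c \in \mathbb{Z}
  \iff c \in \tfrac{1}{2}\mathbb{Z}. \\
&\mathrm{SU}(2),\ \Gamma = \mathbb{Z} J_\alpha:
  && \langle \mu, n J_\alpha \rangle = 2cn \in \mathbb{Z} \text{ for all } n
  \iff c \in \tfrac{1}{2}\mathbb{Z}. \\
&\mathrm{SO}(3),\ \Gamma = \tfrac{1}{2}\mathbb{Z} J_\alpha:
  && \langle \mu, \tfrac{n}{2} J_\alpha \rangle = cn \in \mathbb{Z} \text{ for all } n
  \iff c \in \mathbb{Z}.
\end{aligned}
\]
```

So for SU(2) the two notions of integrality agree, while for SO(3) the element μ = α/2 is algebraically integral but not analytically integral. This is the weight of the two-dimensional representation of su(2), which does not come from a representation of SO(3).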


We have considered so far two subsets of 𝔱: the kernel Γ of the exponential
mapping and the set J of all integer linear combinations of the co-roots. There
is one other subset of 𝔱 that is relevant, namely the set

\[
\Lambda = \{ H \in \mathfrak{t} \mid \langle \alpha, H \rangle \in \mathbb{Z} \text{ for all real roots } \alpha \}.
\]

It can be shown that every element of the kernel of the exponential mapping
has this property, so that we have the inclusions

\[
J \subset \Gamma \subset \Lambda.
\]

Theorem E.9. Let K be a connected compact group with Lie algebra 𝔨 and
let 𝔱 be a maximal commutative subalgebra of 𝔨. Let Λ denote the set of H ∈ 𝔱
such that ⟨α, H⟩ is an integer for all real roots α and let Γ denote the kernel
of the exponential mapping. Then, Γ ⊂ Λ and

\[
\Lambda / \Gamma \cong Z(K),
\]

where Z(K) denotes the center of K.


Note here that K does not have to be simply connected or semisimple. If
K is commutative, then there are no roots and, so, in that case, Λ is simply
all of 𝔱. In that case, we have Λ/Γ = 𝔱/Γ ≅ K (i.e., K = Z(K)). If K is not
semisimple, then the center of K will be at least one dimensional; for example,
the center of U(n) is the set of matrices of the form e^{iθ}I and has dimension
one.
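
As an illustration of Theorem E.9, consider SU(n) with 𝔱 the diagonal matrices H = i·diag(a_1, ..., a_n) with a_1 + ... + a_n = 0. For the root vector E_jk we have [H, E_jk] = i(a_j − a_k)E_jk, so H lies in Λ when all differences a_j − a_k are integers, and H lies in Γ when all the a_j are integers. The sketch below is our own construction (the function names are hypothetical). It lists one representative of each coset in Λ/Γ and checks that its exponential is a scalar matrix, hence central.

```python
import cmath
from fractions import Fraction

def lambda_representatives(n):
    """Diagonal entries a of H = i*diag(a), one choice for each m = 0..n-1."""
    reps = []
    for m in range(n):
        # First n-1 entries equal m/n; the last is shifted by -m so the sum is 0.
        reps.append([Fraction(m, n)] * (n - 1) + [Fraction(m, n) - m])
    return reps

def in_Lambda(a):
    """All root values a_j - a_k are integers."""
    return all((x - y).denominator == 1 for x in a for y in a)

def in_Gamma(a):
    """exp(2*pi*H) = I, i.e. every a_j is an integer."""
    return all(x.denominator == 1 for x in a)

def exp_2pi(a):
    """Diagonal entries of exp(2*pi*H) for H = i*diag(a)."""
    return [cmath.exp(2j * cmath.pi * float(x)) for x in a]

for a in lambda_representatives(3):
    print([str(x) for x in a], in_Lambda(a), in_Gamma(a), exp_2pi(a))
```

For n = 3 the three representatives exponentiate to I, e^{2πi/3}I, and e^{4πi/3}I, in agreement with the center of SU(3) being the cube roots of unity times the identity.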

Let us see how Theorem E.9 compares to Theorem 8.30, which applies in 
the case that K is simply connected. From Theorem E.7, we have that when 
K is simply connected, Γ is the same as J. So, Theorem E.9 tells us that, in the
simply-connected case, Z(K) is isomorphic to A/J. Meanwhile, Theorem 8.30 
says that Z(K) is isomorphic to the set of (algebraically) integral elements 
modulo the root lattice. 

Let us now see how to reconcile these two results. We have already said
that a lattice Γ inside a finite-dimensional real inner-product space E is the
set of integer linear combinations of a basis for E. The dual lattice Γ′ is
defined as

\[
\Gamma' = \{ H \in E \mid \langle \gamma, H \rangle \in \mathbb{Z} \text{ for all } \gamma \in \Gamma \}.
\]

This is, again, a lattice; if γ_1, ..., γ_n is the generating set for Γ, then a
generating set for Γ′ is the set γ^1, ..., γ^n satisfying

\[
\langle \gamma_j, \gamma^k \rangle = \delta_{jk}. \tag{E.4}
\]

(To show that such γ^k's do exist, take γ^k inside the one-dimensional
orthogonal complement of the span of {γ_j}_{j≠k}.) The notion of duality is symmetric: If
Γ′ is the dual lattice to Γ, then Γ is the dual lattice to Γ′. This is a consequence
of the symmetry of (E.4) between the γ_j's and the γ^k's.
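
The dual generating set in (E.4) can be computed explicitly. In the sketch below (our own illustration, with hypothetical function names), the generators of the lattice are the rows of a matrix B written in an orthonormal basis of E, and the dual generators are the rows of the inverse transpose of B. Exact rational arithmetic is used, so the generators are assumed to have rational coordinates.

```python
from fractions import Fraction

def inverse(m):
    """Gauss-Jordan inverse over the rationals (assumes m is invertible)."""
    n = len(m)
    aug = [[Fraction(x) for x in row] + [Fraction(int(i == j)) for j in range(n)]
           for i, row in enumerate(m)]
    for col in range(n):
        # Raises StopIteration if no nonzero pivot exists, i.e. m is singular.
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [x / p for x in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                f = aug[r][col]
                aug[r] = [x - f * y for x, y in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def dual_basis(basis):
    """Rows g^k with <g_j, g^k> = delta_jk, where the g_j are the rows of basis."""
    inv = inverse(basis)
    n = len(basis)
    return [[inv[j][k] for j in range(n)] for k in range(n)]  # transpose

def dot(u, v):
    return sum(Fraction(x) * Fraction(y) for x, y in zip(u, v))

basis = [[2, 0], [1, 1]]
dual = dual_basis(basis)
print(dual)
print([[dot(g, h) for h in dual] for g in basis])
print(dual_basis(dual))
```

Applying dual_basis twice returns the original generators, which is the symmetry of the duality relationship noted above.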

If Γ_1 and Γ_2 are lattices with Γ_1 ⊂ Γ_2, then we may regard both lattices
as commutative groups under vector addition and form the quotient group
Γ_2/Γ_1. This will be a finite commutative group. We make use of the following
elementary result from lattice theory. (See, for example, Proposition 1.3.8 in
Martinet (2003).)


Proposition E.10. If Γ_1 ⊂ Γ_2, then Γ_2′ ⊂ Γ_1′ and

\[
\Gamma_2 / \Gamma_1 \cong \Gamma_1' / \Gamma_2'.
\]

The lattice of algebraically integral elements is precisely the dual lattice
to the lattice J generated by the co-roots. After all, if ⟨μ, J_α⟩ is an integer for
each co-root J_α, then ⟨μ, H⟩ is an integer whenever H is an integer combination
of co-roots. Meanwhile, Λ is precisely the dual to the lattice generated by the
roots, and, thus, by the symmetry of the duality relationship, the dual lattice
to Λ is the root lattice. So,

\[
\begin{aligned}
J' &= \text{algebraically integral elements}, \\
\Lambda' &= \text{root lattice}.
\end{aligned}
\]

Proposition E.10 then tells us that

\[
\Lambda / J \cong (\text{algebraically integral elements}) / (\text{root lattice}).
\]

Thus, we see that Theorem 8.30 and Theorem E.9 have the same content in
the simply-connected case.
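
The common order of these two quotient groups can be computed from the simple roots. This is our own illustration, under the assumption that the roots span 𝔱 (the semisimple case). Expanding each simple co-root in the basis of Λ dual to the simple roots gives the integer matrix with entries ⟨α_j, J_{α_k}⟩ = 2⟨α_j, α_k⟩/⟨α_k, α_k⟩, and the index of J in Λ is the absolute value of its determinant. The function names are hypothetical, and only the rank-two case is handled.

```python
from fractions import Fraction

def dot(u, v):
    return sum(Fraction(x) * Fraction(y) for x, y in zip(u, v))

def cartan_matrix(simple_roots):
    """Entries c[j][k] = 2<a_j, a_k>/<a_k, a_k>, which equals <a_j, J_{a_k}> by (E.1)."""
    return [[2 * dot(a, b) / dot(b, b) for b in simple_roots]
            for a in simple_roots]

def det2(m):
    """Determinant of a 2 x 2 matrix."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

# Simple roots written in coordinates with the standard inner product.
a2 = [(1, -1, 0), (0, 1, -1)]   # root system of SU(3)
b2 = [(1, -1), (0, 1)]          # root system of type B2

for name, roots in (("A2", a2), ("B2", b2)):
    c = cartan_matrix(roots)
    print(name, c, abs(det2(c)))
```

For the SU(3) roots the index is 3, consistent with the center of SU(3) having three elements. The same determinant (of the transposed matrix) gives the index of the root lattice in the lattice of algebraically integral elements, as Proposition E.10 requires.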

Note that “dual” to the inclusions J ⊂ Γ ⊂ Λ, we have inclusions in the
reverse directions:

\[
\text{root lattice} \subset \text{analytically integral elements} \subset \text{algebraically integral elements}.
\]


We conclude this section with a brief discussion of how one goes about
proving Theorem E.7. We consider the inclusion map i : T → K. This map
induces a homomorphism i_* : π_1(T) → π_1(K). Concretely, this means simply
that every loop in T is also a loop in K and that two loops in T that are
homotopic in T are certainly homotopic in K. The reverse is not true: Two
loops in T that are not homotopic in T may, nevertheless, be homotopic in
K. Thus, the map i_* : π_1(T) → π_1(K) may not be injective.

Although i_* : π_1(T) → π_1(K) is typically not injective, it can be shown
that it is surjective. This says that every loop in K (say, based at the identity)
is homotopic to a loop in T. One way to prove this is to use differential
geometry. We choose on K a Riemannian metric that is invariant under both
the left and the right action of K. Such a metric exists because K is compact.
Then, it is a standard calculation (Helgason (1978)) that with respect to such
a metric, the geodesics through the identity are precisely the maps of the
form t ↦ exp(tX), for X in the Lie algebra 𝔨. It is also a standard result
from differential geometry that every homotopy class of loops in a compact
Riemannian manifold contains a geodesic. So, in each homotopy class of loops
based at the identity, we can find one of the form t ↦ exp(tX), where for this
curve to be a loop, it must be that exp(X) = I. Now, it is a standard result
about compact groups that every element of the Lie algebra is conjugate to
an element of 𝔱. So, there exists g in K such that gXg^{-1} ∈ 𝔱. Since K is
connected, we can find a path g(s) connecting the identity to g, and then the
one-parameter family of loops

\[
(s, t) \mapsto g(s)\, e^{tX} g(s)^{-1}
\]




is a homotopy of the loop t ↦ exp(tX) with the loop g exp(tX) g^{-1} =
exp(t gXg^{-1}), which lies in T.

We conclude, then, that every loop in K is homotopic to a loop in T. This
means that the map i_* : π_1(T) → π_1(K) is surjective. It remains, then, to
compute π_1(T) and to compute the kernel of i_*, and then we will have π_1(K) ≅
π_1(T)/ker i_*. Fortunately, T is just a torus and its fundamental group is easily
shown to be isomorphic to Γ. Specifically, if H ∈ Γ (i.e., exp(2πH) = I), then
we may consider the loop t ↦ exp(2πtH) in T. It is not hard to show that
every loop in T is homotopic to one and only one loop of this form, essentially
because T ≅ 𝔱/Γ and 𝔱 is simply connected. We must then determine which
loops of this form are homotopically trivial in K.

If H is equal to one of the real co-roots J_α, then the loop t ↦ exp(2πtJ_α) is
homotopically trivial. To see this, we again consider the homomorphism Φ :
SU(2) → K described earlier in this section. Then, the loop t ↦ exp(2πtJ_α) is
the image under Φ of the loop t ↦ exp(2πtJ) in SU(2), where J is the matrix
in (E.2). However, SU(2) is simply connected and, so, the loop t ↦ exp(2πtJ)
is homotopically trivial. This means
that the loop t ↦ exp(2πtJ_α) is the continuous image of a homotopically
trivial loop and is therefore homotopically trivial. This argument shows that
the kernel of i_* contains the lattice J. To show that there is nothing else in
the kernel requires some additional effort; see Bröcker and tom Dieck (1985).


E.5 Fundamental Groups of Noncompact Lie Groups 


In Section 1.7, we described the polar decomposition for SL(n;R), namely 
that every matrix A in SL(n; R) can be decomposed uniquely as 


A= RP, 


with R in SO(n) and P a positive symmetric matrix with real entries and 
determinant one. We can carry this one step further and write 


P = e^X,


where X is a real symmetric matrix with trace zero. (To obtain X from P,
diagonalize P, and then take the logarithm of the eigenvalues, which must be
real and positive and whose product is equal to 1.) Let 𝔭 denote the subspace
of sl(n; R) consisting of symmetric matrices (with trace zero). It can be shown
that the map

\[
\mathrm{SO}(n) \times \mathfrak{p} \to \mathrm{SL}(n;\mathbb{R}), \qquad (R, X) \mapsto R e^{X}
\]

is a homeomorphism. Thus,

\[
\pi_1(\mathrm{SL}(n;\mathbb{R})) \cong \pi_1(\mathrm{SO}(n)) \times \pi_1(\mathfrak{p}) \cong \pi_1(\mathrm{SO}(n)),
\]

since 𝔭 ≅ R^d is simply connected. (Here, d = dim 𝔭.) Thus, the computation
of the fundamental group of the noncompact group SL(n;R) can be reduced
to the computation of the fundamental group of the compact group SO(n),
which can be done by the methods of the previous two sections.
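
For n = 2 the polar decomposition A = RP can be computed in closed form, which gives a quick way to see the two factors concretely. The sketch below is an illustration under our own choices, not the method of Section 1.7: it uses the Cayley-Hamilton formula for the square root of a 2 × 2 positive symmetric matrix M, namely (M + √(det M) I)/√(tr M + 2√(det M)), and the function names are hypothetical.

```python
import math

def mul2(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def det2(m):
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

def transpose2(m):
    return [[m[0][0], m[1][0]], [m[0][1], m[1][1]]]

def polar_2x2(a):
    """Return (R, P) with a = R P, R orthogonal and P positive symmetric."""
    m = mul2(transpose2(a), a)              # M = A^T A = P^2
    s = math.sqrt(det2(m))
    t = math.sqrt(m[0][0] + m[1][1] + 2 * s)
    p = [[(m[0][0] + s) / t, m[0][1] / t],
         [m[1][0] / t, (m[1][1] + s) / t]]  # P = sqrt(M)
    d = det2(p)
    p_inv = [[p[1][1] / d, -p[0][1] / d],
             [-p[1][0] / d, p[0][0] / d]]
    return mul2(a, p_inv), p                # R = A P^{-1}

A = [[1.0, 1.0], [0.0, 1.0]]                # an element of SL(2; R)
R, P = polar_2x2(A)
print(R)
print(P)
print(mul2(transpose2(R), R), det2(R), det2(P))
```

Here R lies in SO(2) and P has determinant one, so the pair (R, log P) is the point of SO(2) × 𝔭 corresponding to A under the homeomorphism described above.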

Similar arguments apply to the other noncompact matrix Lie groups G
that we have studied, provided that G is semisimple or reductive. In each
such case, the Lie algebra 𝔤 of G will decompose as 𝔤 = 𝔨 + 𝔭, where 𝔨 is
the space of skew-self-adjoint elements in 𝔤 and 𝔭 is the space of self-adjoint
elements in 𝔤. Then, the subgroup K of G with Lie algebra 𝔨 will be a compact
subgroup of U(n) (called a maximal compact subgroup) and we will get
a decomposition g = x exp X, with x ∈ K and X ∈ 𝔭. This shows that G is
homeomorphic to K × 𝔭 and, therefore, that π_1(G) ≅ π_1(K). The following
list shows the resulting isomorphisms. Once the fundamental groups of the
compact groups have been computed using the results of the previous two
sections, this allows us to fill in the table of fundamental groups of noncompact
groups from Chapter 1.

\[
\begin{aligned}
\pi_1(\mathrm{GL}(n;\mathbb{R})^{+}) &\cong \pi_1(\mathrm{SO}(n)), \\
\pi_1(\mathrm{GL}(n;\mathbb{C})) &\cong \pi_1(\mathrm{U}(n)), \\
\pi_1(\mathrm{SL}(n;\mathbb{C})) &\cong \pi_1(\mathrm{SU}(n)), \\
\pi_1(\mathrm{SO}(n;\mathbb{C})) &\cong \pi_1(\mathrm{SO}(n)), \\
\pi_1(\mathrm{SO}_e(n,1)) &\cong \pi_1(\mathrm{SO}(n)), \\
\pi_1(\mathrm{Sp}(n;\mathbb{R})) &\cong \pi_1(\mathrm{U}(n)), \\
\pi_1(\mathrm{Sp}(n;\mathbb{C})) &\cong \pi_1(\mathrm{Sp}(n)).
\end{aligned}
\]